tooluniverse-sequence-retrieval

Retrieves biological sequences (DNA, RNA, protein) from NCBI and ENA with gene disambiguation, accession type handling, and comprehensive sequence profiles. Creates detailed reports with sequence metadata, cross-database references, and download options. Use when users need nucleotide sequences, protein sequences, genome data, or mention GenBank, RefSeq, EMBL accessions.

Install

mkdir -p .claude/skills/tooluniverse-sequence-retrieval && curl -L -o skill.zip "https://mcp.directory/api/skills/download/6600" && unzip -o skill.zip -d .claude/skills/tooluniverse-sequence-retrieval && rm skill.zip

Installs to .claude/skills/tooluniverse-sequence-retrieval

About this skill

Biological Sequence Retrieval

Retrieve DNA, RNA, and protein sequences with proper disambiguation and cross-database handling.

IMPORTANT: Always use English terms in tool calls. Only try original-language terms as fallback. Respond in the user's language.

LOOK UP DON'T GUESS: Never assume accession numbers or sequence versions. Always retrieve and verify from NCBI or ENA.

Domain Reasoning

Sequence quality hierarchy: RefSeq (NM_/NP_ = curated) > RefSeq predicted (XM_/XP_) > GenBank (submitted). Prefer the MANE Select transcript for human canonical isoforms. Check version numbers -- annotations improve across versions.
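
The hierarchy above can be sketched as a sort key. This is a minimal illustration, not part of the skill's tooling: the helper names (split_version, quality_rank, sort_by_quality) are hypothetical, MANE Select status is not modeled, and the accession strings in the usage note are made-up inputs for sorting only.

```python
from typing import List, Optional, Tuple

CURATED = ("NM_", "NP_")     # RefSeq curated
PREDICTED = ("XM_", "XP_")   # RefSeq predicted


def split_version(accession: str) -> Tuple[str, Optional[int]]:
    """Split 'BASE.VERSION' into (base, version); version is None if absent."""
    base, _, version = accession.strip().partition(".")
    return base, int(version) if version.isdigit() else None


def quality_rank(accession: str) -> int:
    """0 = RefSeq curated, 1 = RefSeq predicted, 2 = everything else."""
    acc = accession.strip().upper()
    if acc.startswith(CURATED):
        return 0
    if acc.startswith(PREDICTED):
        return 1
    return 2


def sort_by_quality(accessions: List[str]) -> List[str]:
    """Best tier first; within a tier, higher version numbers first."""
    def key(acc: str) -> Tuple[int, int]:
        _, version = split_version(acc)
        return (quality_rank(acc), -(version or 0))
    return sorted(accessions, key=key)
```

For example, sort_by_quality(["U12345.1", "XM_000001.2", "NM_000546.5", "NM_000546.6"]) puts the two NM_ entries first (version 6 before version 5), then the XM_ entry, then the unprefixed one.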

Workflow

Phase 0: Clarify (if needed) → Phase 1: Disambiguate Gene/Organism → Phase 2: Search & Retrieve → Phase 3: Report

Phase 0: Clarification (When Needed)

Ask ONLY if: gene exists in multiple organisms, sequence type unclear, or strain matters. Skip for: specific accessions, clear organism+gene combos, complete genome requests with organism.

Phase 1: Gene/Organism Disambiguation

Accession Type Decision Tree

Prefix	Type	Use With
NC_/NM_/NR_/NP_/XM_/XP_/XR_/NZ_	RefSeq	NCBI only
U/M/K/X/CP*	GenBank	NCBI or ENA
EMBL format	EMBL	ENA preferred

CRITICAL: Never try ENA tools with RefSeq accessions -- they return 404.
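
A minimal sketch of the routing rule, assuming the general RefSeq convention of two letters followed by an underscore (which also covers NZ_, XP_, and XR_). The function name classify_accession and the returned keys are hypothetical, and anything without that prefix is simply treated as GenBank/EMBL here.

```python
import re

# RefSeq accessions use a two-letter prefix followed by an underscore.
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_")


def classify_accession(accession: str) -> dict:
    """Return the accession family and which services may be queried."""
    acc = accession.strip().upper()
    if REFSEQ_PATTERN.match(acc):
        # RefSeq is NCBI-only; ENA tools would return 404.
        return {"family": "RefSeq", "use_ncbi": True, "use_ena": False}
    # Assumption for this sketch: everything else is a GenBank/EMBL accession.
    return {"family": "GenBank/EMBL", "use_ncbi": True, "use_ena": True}
```

Checking use_ena before any ENA call keeps RefSeq accessions away from the ENA tools entirely.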

Identity Checklist

Organism confirmed (scientific name)
Gene symbol/name identified
Sequence type determined (genomic/mRNA/protein)
Accession prefix identified for tool selection

Phase 2: Data Retrieval (Internal)

Retrieve silently. Do NOT narrate the search process.

# Assumes `tu` is an initialized ToolUniverse client with tools loaded.

# Search NCBI Nucleotide
result = tu.tools.NCBI_search_nucleotide(
    operation="search", organism=organism, gene=gene,
    strain=strain, keywords=keywords, seq_type=seq_type, limit=10
)

# Get accessions from UIDs
accessions = tu.tools.NCBI_fetch_accessions(operation="fetch_accession", uids=result["data"]["uids"])

# Retrieve sequence (FASTA or GenBank format)
# `accession` is a single accession string, chosen from `accessions` or supplied by the user
sequence = tu.tools.NCBI_get_sequence(operation="fetch_sequence", accession=accession, format="fasta")

# ENA alternative (non-RefSeq accessions only)
entry = tu.tools.ena_get_entry(accession=accession)
fasta = tu.tools.ena_get_sequence_fasta(accession=accession)

Fallback Chains

Primary	Fallback	Notes
NCBI_get_sequence	ENA (if GenBank accession)	NCBI unavailable
ena_get_entry	NCBI_get_sequence	ENA doesn't have RefSeq
NCBI_search_nucleotide	Try broader keywords	No results
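
The first row of the table can be sketched as follows. The fetchers are passed in as plain callables so the logic can be read without a live client; fetch_with_fallback and its parameters are hypothetical names, and in practice the callables would wrap NCBI_get_sequence and ena_get_sequence_fasta.

```python
from typing import Callable, List, Tuple


def fetch_with_fallback(
    accession: str,
    fetch_ncbi: Callable[[str], str],
    fetch_ena: Callable[[str], str],
) -> Tuple[str, str]:
    """Try NCBI first; fall back to ENA only for non-RefSeq accessions."""
    # RefSeq accessions look like 'NM_...': two letters, then an underscore.
    is_refseq = len(accession) > 2 and accession[2] == "_"
    sources = [("NCBI", fetch_ncbi)]
    if not is_refseq:
        sources.append(("ENA", fetch_ena))  # ENA doesn't have RefSeq

    errors: List[str] = []
    for name, fetch in sources:
        try:
            return name, fetch(accession)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All sources failed -- " + "; ".join(errors))
```

The returned source name lets the report state which database actually supplied the sequence.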

Phase 3: Report Sequence Profile

Present as a Sequence Profile Report. Hide search process. Include:

Search Summary: query, database, result count
Primary Sequence: accession, type (RefSeq/GenBank), organism, strain, length, molecule, topology, curation level
Sequence Preview: first lines of FASTA (truncated)
Annotations Summary: CDS/tRNA/rRNA/regulatory feature counts (from GenBank format)
Alternative Sequences: ranked by relevance and curation, with ENA compatibility
Cross-Database References: RefSeq, GenBank, ENA/EMBL, BioProject, BioSample
Download Options: FASTA (for BLAST/alignment), GenBank (for annotation)
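
For the Sequence Preview item, a truncation helper might look like the sketch below. The name fasta_preview and the default of three sequence lines are assumptions for illustration, not something the skill prescribes.

```python
def fasta_preview(fasta: str, max_lines: int = 3) -> str:
    """Keep the FASTA header plus the first `max_lines` sequence lines."""
    lines = fasta.strip().splitlines()
    if not lines:
        return ""
    header, body = lines[0], lines[1:]
    shown = body[:max_lines]
    hidden = len(body) - len(shown)
    preview = [header] + shown
    if hidden > 0:
        # Tell the reader how much was cut instead of silently truncating.
        preview.append(f"... ({hidden} more lines)")
    return "\n".join(preview)
```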

Curation Level Tiers

Tier	Prefix	Description
RefSeq Reference (best)	NC_, NM_, NP_	NCBI-curated, gold standard
RefSeq Predicted	XM_, XP_, XR_	Computationally predicted
GenBank Validated	Various	Submitted, some curation
GenBank Direct	Various	Direct submission
Third Party	TPA_	Third-party annotation

Reasoning Framework

Sequence quality: Prefer RefSeq over GenBank. Check version numbers. Sequences with "PREDICTED" in definition are not experimentally validated.

Accession guidance: RefSeq = NCBI-only. GenBank = mirrored in ENA/EMBL. Default to RefSeq mRNA (NM_) for human/model organisms; for microbial queries, default to the most complete genome assembly.

Cross-database reconciliation: The same sequence may have different accessions (e.g., GenBank U00096 = RefSeq NC_000913 for E. coli K-12). Always report both when available. Discrepancies between GenBank and RefSeq records typically indicate that RefSeq curation corrected submission errors.

Synthesis Questions

What is the highest-quality accession available?
Are there alternative accessions in other databases?
What is the annotation completeness?
Is the sequence from the expected organism/strain?
What download format suits the user's downstream analysis?

Error Handling

Error	Response
"No search criteria provided"	Add organism, gene, or keywords
"ENA 404 error"	Likely RefSeq -- use NCBI only
"No results found"	Broaden search, check spelling, try synonyms
"Sequence too large"	Note size, provide download link instead

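The table above can be kept as a lookup. This is a sketch only: matching by case-insensitive substring is an assumption, since the exact error strings returned by the tools are not specified here, and suggest_response is a hypothetical helper name.

```python
from typing import Optional

# Error text (lowercased) -> response, copied from the table above.
ERROR_RESPONSES = {
    "no search criteria provided": "Add organism, gene, or keywords",
    "ena 404 error": "Likely RefSeq -- use NCBI only",
    "no results found": "Broaden search, check spelling, try synonyms",
    "sequence too large": "Note size, provide download link instead",
}


def suggest_response(error_message: str) -> Optional[str]:
    """Return the suggested response for a known error, else None."""
    text = error_message.lower()
    for pattern, response in ERROR_RESPONSES.items():
        if pattern in text:
            return response
    return None
```

An unrecognized message returns None, so the caller can surface the raw error rather than guess.
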
Tool Reference

NCBI Tools: NCBI_search_nucleotide (search), NCBI_fetch_accessions (UID→accession), NCBI_get_sequence (retrieve)

ENA Tools (GenBank/EMBL only): ena_get_entry (metadata), ena_get_sequence_fasta (FASTA), ena_get_entry_summary (summary)

Search Parameters Reference

NCBI_search_nucleotide: operation="search", organism (scientific name), gene (symbol), strain, keywords, seq_type (complete_genome/mrna/refseq), limit

NCBI_get_sequence: operation="fetch_sequence", accession, format (fasta/genbank)
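
A small builder can validate search arguments before the call is made. This is a sketch under assumptions: build_search_args is a hypothetical helper, and whether NCBI_search_nucleotide accepts omitted optional parameters (rather than explicit None values) has not been verified here.

```python
ALLOWED_SEQ_TYPES = ("complete_genome", "mrna", "refseq")


def build_search_args(organism=None, gene=None, strain=None,
                      keywords=None, seq_type=None, limit=10):
    """Assemble keyword arguments for a nucleotide search, dropping unset fields."""
    if not (organism or gene or keywords):
        raise ValueError("No search criteria provided: add organism, gene, or keywords")
    if seq_type is not None and seq_type not in ALLOWED_SEQ_TYPES:
        raise ValueError(f"seq_type must be one of {ALLOWED_SEQ_TYPES}")
    optional = {"organism": organism, "gene": gene, "strain": strain,
                "keywords": keywords, "seq_type": seq_type}
    args = {"operation": "search", "limit": limit}
    # Only pass parameters the caller actually set.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args
```

The result would then be unpacked into the search call, e.g. tu.tools.NCBI_search_nucleotide(**args), assuming `tu` is an initialized client.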

Author

mims-harvard
