tooluniverse-binder-discovery


Discover novel small molecule binders for protein targets using structure-based and ligand-based approaches. Creates actionable reports with candidate compounds, ADMET profiles, and synthesis feasibility. Use when users ask to find small molecules for a target, identify novel binders, perform virtual screening, or need hit-to-lead compound identification.

Install

mkdir -p .claude/skills/tooluniverse-binder-discovery && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3540" && unzip -o skill.zip -d .claude/skills/tooluniverse-binder-discovery && rm skill.zip

Installs to .claude/skills/tooluniverse-binder-discovery

About this skill

Small Molecule Binder Discovery Strategy

Systematic discovery of novel small molecule binders using 60+ ToolUniverse tools across druggability assessment, known ligand mining, similarity expansion, ADMET filtering, and synthesis feasibility.

KEY PRINCIPLES:

  1. Report-first approach - Create report file FIRST, then populate progressively
  2. Target validation FIRST - Confirm druggability before compound searching
  3. Multi-strategy approach - Combine structure-based and ligand-based methods
  4. ADMET-aware filtering - Eliminate poor compounds early
  5. Evidence grading - Grade candidates by supporting evidence
  6. Actionable output - Provide prioritized candidates with rationale
  7. English-first queries - Always use English terms in tool calls, even if the user writes in another language. Try original-language terms only as a fallback. Respond in the user's language.

Critical Workflow Requirements

1. Report-First Approach (MANDATORY)

DO NOT show search process or tool outputs to the user. Instead:

  1. Create the report file FIRST - Before any data collection:

    • File name: [TARGET]_binder_discovery_report.md
    • Initialize with all section headers from the template (see REPORT_TEMPLATE.md)
    • Add placeholder text: [Researching...] in each section
  2. Progressively update the report - As you gather data:

    • Update each section with findings immediately
    • The user sees the report growing, not the search process
  3. Output separate data files:

    • [TARGET]_candidate_compounds.csv - Prioritized compounds with SMILES, scores
    • [TARGET]_bibliography.json - Literature references (optional)
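The report-first steps above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the section titles in `SECTIONS` are hypothetical stand-ins for the headers in REPORT_TEMPLATE.md, and `init_report` / `update_section` are helper names invented for this example.

```python
from pathlib import Path

# Hypothetical section headers; the real list comes from REPORT_TEMPLATE.md.
SECTIONS = [
    "Target Validation",
    "Known Ligands",
    "Structure Analysis",
    "Compound Expansion",
    "ADMET Filtering",
    "Prioritized Candidates",
]

PLACEHOLDER = "[Researching...]"


def init_report(target: str, out_dir: str = ".") -> Path:
    """Create [TARGET]_binder_discovery_report.md with a placeholder in every section."""
    path = Path(out_dir) / f"{target}_binder_discovery_report.md"
    lines = [f"# {target} Binder Discovery Report", ""]
    for title in SECTIONS:
        lines += [f"## {title}", "", PLACEHOLDER, ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path


def update_section(path: Path, title: str, body: str) -> None:
    """Replace the placeholder under one section header with findings."""
    text = path.read_text(encoding="utf-8")
    marker = f"## {title}\n\n{PLACEHOLDER}"
    if marker not in text:
        raise KeyError(f"no pending placeholder for section {title!r}")
    path.write_text(text.replace(marker, f"## {title}\n\n{body}", 1), encoding="utf-8")
```

Calling `init_report("EGFR")` before any tool call, then `update_section(...)` as each phase completes, keeps the file as the only thing the user watches grow.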

2. Citation Requirements (MANDATORY)

Every piece of information MUST include its source:

*Source: ChEMBL via `ChEMBL_get_target_activities` (CHEMBL203)*
*Source: PDB via `get_protein_metadata_by_pdb_id` (1M17)*
*Source: ADMET-AI via `ADMETAI_predict_toxicity`*
*Source: NVIDIA NIM via `NvidiaNIM_alphafold2` (pLDDT: 90.94)*
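A small formatter can keep these source lines consistent across the report. The helper name `format_source` is an assumption made for this sketch; only the output pattern is taken from the examples above.

```python
def format_source(database: str, tool: str, detail: str = "") -> str:
    """Build a citation line in the pattern: *Source: DB via `tool` (detail)*."""
    suffix = f" ({detail})" if detail else ""
    return f"*Source: {database} via `{tool}`{suffix}*"


print(format_source("ChEMBL", "ChEMBL_get_target_activities", "CHEMBL203"))
print(format_source("ADMET-AI", "ADMETAI_predict_toxicity"))
```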

Workflow Overview

Phase 0: Tool Verification (check parameter names)
    |
Phase 1: Target Validation
    |- 1.1 Resolve identifiers (UniProt, Ensembl, ChEMBL target ID)
    |- 1.2 Assess druggability/tractability
    |   +- 1.2a GPCRdb integration (for GPCR targets)
    |   +- 1.2.5 Check therapeutic antibodies (Thera-SAbDab)
    |- 1.3 Identify binding sites
    +- 1.4 Predict structure (NvidiaNIM_alphafold2/esmfold)
    |
Phase 2: Known Ligand Mining
    |- ChEMBL bioactivity data
    |- GtoPdb interactions
    |- Chemical probes (Open Targets)
    |- BindingDB affinity data (Ki/IC50/Kd)
    |- PubChem BioAssay HTS data (screening hits)
    +- SAR analysis from known actives
    |
Phase 3: Structure Analysis
    |- PDB structures with ligands
    |- EMDB cryo-EM structures (for membrane targets)
    |- Binding pocket analysis
    +- Key interactions
    |
Phase 3.5: Docking Validation (NvidiaNIM_diffdock/boltz2)
    |- Dock reference inhibitor
    +- Validate binding pocket geometry
    |
Phase 4: Compound Expansion
    |- 4.1-4.3 Similarity/substructure search
    +- 4.4 De novo generation (NvidiaNIM_genmol/molmim)
    |
Phase 5: ADMET Filtering
    |- Physicochemical properties (Lipinski, QED)
    |- Bioavailability, toxicity, CYP interactions
    +- Structural alerts (PAINS)
    |
Phase 6: Candidate Docking & Prioritization
    |- Dock all candidates (NvidiaNIM_diffdock/boltz2)
    |- Score by docking (40%) + ADMET (30%) + similarity (20%) + novelty (10%)
    |- Assess synthesis feasibility
    +- Generate final ranked list (top 20)
    |
Phase 6.5: Literature Evidence
    |- PubMed (peer-reviewed SAR studies)
    |- EuropePMC preprints (source='PPR')
    +- OpenAlex citation analysis
    |
Phase 7: Report Synthesis & Delivery

Phase 0: Tool Verification

CRITICAL: Verify tool parameters before calling unfamiliar tools.

tool_info = tu.tools.get_tool_info(tool_name="ChEMBL_get_target_activities")

Known Parameter Corrections

| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| OpenTargets_* | ensembl_id | ensemblId (camelCase) |
| ChEMBL_get_target_activities | chembl_target_id | target_chembl_id |
| ChEMBL_search_similar_molecules | smiles | molecule (accepts SMILES, ChEMBL ID, or name) |
| alphafold_get_prediction | uniprot | accession |
| ADMETAI_* | smiles="..." | smiles=["..."] (must be list) |
| NvidiaNIM_alphafold2 | seq | sequence |
| NvidiaNIM_genmol | smiles="C..." | smiles="C...[*{1-3}]..." (must have mask) |
| NvidiaNIM_boltz2 | sequence="..." | polymers=[{"molecule_type": "protein", "sequence": "..."}] |
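One way to apply the simple renames from the corrections list automatically is a small argument normalizer run before each call. This is a sketch under assumptions: `normalize_args` is a hypothetical helper, not part of ToolUniverse, and it covers only the pure renames plus the ADMET-AI list wrapping. The GenMol mask and Boltz2 `polymers` corrections change the shape of the input and are left to the caller.

```python
import fnmatch

# (tool-name pattern, wrong parameter, correct parameter) from the corrections list.
RENAMES = [
    ("OpenTargets_*", "ensembl_id", "ensemblId"),
    ("ChEMBL_get_target_activities", "chembl_target_id", "target_chembl_id"),
    ("ChEMBL_search_similar_molecules", "smiles", "molecule"),
    ("alphafold_get_prediction", "uniprot", "accession"),
    ("NvidiaNIM_alphafold2", "seq", "sequence"),
]


def normalize_args(tool_name: str, args: dict) -> dict:
    """Return a copy of args with known-wrong parameter names corrected."""
    fixed = dict(args)
    for pattern, wrong, right in RENAMES:
        if fnmatch.fnmatchcase(tool_name, pattern) and wrong in fixed:
            value = fixed.pop(wrong)
            fixed.setdefault(right, value)  # keep an already-correct value if present
    # ADMETAI_* tools expect a list of SMILES, not a bare string.
    if fnmatch.fnmatchcase(tool_name, "ADMETAI_*") and isinstance(fixed.get("smiles"), str):
        fixed["smiles"] = [fixed["smiles"]]
    return fixed
```

For example, `normalize_args("ADMETAI_predict_toxicity", {"smiles": "CCO"})` returns `{"smiles": ["CCO"]}`.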

Phase 1: Target Validation

1.1 Identifier Resolution

Resolve all IDs upfront and store for downstream queries:

1. UniProt_search(query=target_name, organism="human") -> UniProt accession
2. MyGene_query_genes(q=gene_symbol, species="human") -> Ensembl gene ID
3. ChEMBL_search_targets(query=target_name, organism="Homo sapiens") -> ChEMBL target ID
4. GtoPdb_get_targets(query=target_name) -> GtoPdb ID (if GPCR/channel/enzyme)

1.2 Druggability Assessment

Use multi-source triangulation:

  • OpenTargets_get_target_tractability_by_ensemblID(ensemblId) - tractability bucket
  • DGIdb_get_gene_druggability(genes=[gene_symbol]) - druggability categories
  • OpenTargets_get_target_classes_by_ensemblID(ensemblId) - target class
  • For GPCRs: GPCRdb_get_protein + GPCRdb_get_ligands + GPCRdb_get_structures
  • For antibody landscape: TheraSAbDab_search_by_target(target=target_name)

Decision Point: If druggability < 2 stars, warn user about challenges.

1.3 Binding Site Analysis

  • ChEMBL_search_binding_sites(target_chembl_id)
  • get_binding_affinity_by_pdb_id(pdb_id) for co-crystallized ligands
  • InterPro_get_protein_domains(accession) for domain architecture

1.4 Structure Prediction (NVIDIA NIM)

Requires NVIDIA_API_KEY. Two options:

  • AlphaFold2: NvidiaNIM_alphafold2(sequence, algorithm="mmseqs2") - high accuracy, 5-15 min
  • ESMFold: NvidiaNIM_esmfold(sequence) - fast (~30s), max 1024 AA

Always report pLDDT confidence scores (>=90 very high, 70-90 confident, <70 caution).
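The confidence bands can be attached to every reported score with a helper like the one below. The function name and band labels are choices made for this sketch; the cut-offs are the ones stated above.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band used in the report."""
    if not 0 <= plddt <= 100:
        raise ValueError(f"pLDDT must be within 0-100, got {plddt}")
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    return "caution"


print(plddt_band(90.94))
```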


Phase 2: Known Ligand Mining

Tools (in order of priority)

| Source | Tool | Strengths |
|---|---|---|
| ChEMBL | ChEMBL_get_target_activities | Curated, SAR-ready |
| BindingDB | BindingDB_get_ligands_by_uniprot | Direct Ki/Kd, literature links |
| GtoPdb | GtoPdb_get_target_interactions | Pharmacology focus (GPCRs, channels) |
| PubChem | PubChem_search_assays_by_target_gene | HTS screens, novel scaffolds |
| Open Targets | OpenTargets_get_chemical_probes_by_target_ensemblID | Validated probes |

Key Steps

  1. Get all bioactivities: filter to IC50/Ki/Kd < 10 uM
  2. Get molecule details for top actives: ChEMBL_get_molecule
  3. Identify chemical probes and approved drugs
  4. Analyze SAR: common scaffolds, key modifications
  5. Check off-target selectivity: BindingDB_get_targets_by_compound
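Step 1 (keeping IC50/Ki/Kd values below 10 uM) needs unit normalization before the comparison. The sketch below assumes each activity record is a dict with `standard_type`, `standard_value`, and `standard_units` keys. That shape is an assumption about the tool output and should be checked against the actual response before use.

```python
# Conversion factors to nanomolar.
TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "mM": 1e6}
POTENCY_TYPES = {"IC50", "Ki", "Kd"}


def potent_activities(records, cutoff_nm=10_000.0):
    """Keep IC50/Ki/Kd records below the cutoff (default 10 uM), most potent first."""
    kept = []
    for rec in records:
        factor = TO_NM.get(rec.get("standard_units"))
        if rec.get("standard_type") not in POTENCY_TYPES or factor is None:
            continue
        try:
            value_nm = float(rec["standard_value"]) * factor
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value
        if value_nm < cutoff_nm:
            kept.append({**rec, "value_nm": value_nm})
    return sorted(kept, key=lambda r: r["value_nm"])
```

Records with unknown units are dropped rather than guessed at, so the count of skipped rows is worth noting in the report.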

Phase 3: Structure Analysis

Tools

  • PDB_search_similar_structures(query=uniprot, type="sequence") - find PDB entries
  • get_protein_metadata_by_pdb_id(pdb_id) - resolution, method
  • get_binding_affinity_by_pdb_id(pdb_id) - co-crystal ligand affinities
  • get_ligand_smiles_by_chem_comp_id(chem_comp_id) - ligand SMILES from PDB
  • emdb_search(query) - cryo-EM structures (prefer for GPCRs, ion channels)
  • alphafold_get_prediction(accession) - AlphaFold DB fallback

Phase 3.5: Docking Validation (NVIDIA NIM)

| Situation | Tool | Input |
|---|---|---|
| Have PDB + SDF | NvidiaNIM_diffdock | protein=PDB, ligand=SDF, num_poses=10 |
| Have sequence + SMILES | NvidiaNIM_boltz2 | polymers=[...], ligands=[...] |

Dock a known reference inhibitor first to validate the binding pocket.


Phase 4: Compound Expansion

4.1-4.3 Search-Based Expansion

Use 3-5 diverse actives as seeds, similarity threshold 70-85%:

  • ChEMBL_search_similar_molecules(molecule=SMILES, similarity=70)
  • PubChem_search_compounds_by_similarity(smiles, threshold=0.7)
  • ChEMBL_search_substructure(smiles=core_scaffold)
  • STITCH_get_chemical_protein_interactions(identifier=gene, species=9606)

4.4 De Novo Generation (NVIDIA NIM)

GenMol - scaffold hopping with masked regions:

NvidiaNIM_genmol(smiles="...core...[*{3-8}]...tail...[*{1-3}]...", num_molecules=100, temperature=2.0, scoring="QED")

Mask syntax: [*{min-max}] specifies atom count range.
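Because a template without a mask is a known failure mode, it can help to check the string before calling the tool. The following is a sketch: `mask_ranges` is a hypothetical helper, and the `min <= max` check is an assumption about what the service accepts, not a documented rule.

```python
import re

# Matches masks of the form [*{min-max}], e.g. [*{3-8}].
MASK = re.compile(r"\[\*\{(\d+)-(\d+)\}\]")


def mask_ranges(template: str) -> list[tuple[int, int]]:
    """Return the (min, max) atom-count range of every mask in a GenMol template."""
    ranges = [(int(lo), int(hi)) for lo, hi in MASK.findall(template)]
    if not ranges:
        raise ValueError("template contains no [*{min-max}] mask")
    for lo, hi in ranges:
        if lo > hi:
            raise ValueError(f"mask range {lo}-{hi} has min greater than max")
    return ranges


print(mask_ranges("c1ccccc1[*{3-8}]C(=O)N[*{1-3}]"))
```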

MolMIM - controlled analog generation:

NvidiaNIM_molmim(smi=reference_smiles, num_molecules=50, algorithm="CMA-ES")

Phase 5: ADMET Filtering

Apply filters sequentially (all take smiles=[list]):

| Step | Tool | Filter Criteria |
|---|---|---|
| Physicochemical | ADMETAI_predict_physicochemical_properties | Lipinski <= 1, QED > 0.3, MW 200-600 |
| Bioavailability | ADMETAI_predict_bioavailability | Oral bioavailability > 0.3 |
| Toxicity | ADMETAI_predict_toxicity | AMES < 0.5, hERG < 0.5, DILI < 0.5 |
| CYP | ADMETAI_predict_CYP_interactions | Flag CYP3A4 inhibitors |
| Alerts | ChEMBL_search_compound_structural_alerts | No PAINS |

Include a filter funnel table in the report showing pass/fail counts at each stage.
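A sequential funnel with per-stage counts can be sketched as below. The thresholds come from the filter criteria above, but the property keys (`lipinski_violations`, `qed`, `herg`, and so on) are hypothetical names, not actual ADMET-AI output fields. The CYP step is omitted because it flags compounds rather than removing them.

```python
# Each stage is (name, predicate). Keys on the compound dicts are illustrative only.
STAGES = [
    ("Physicochemical",
     lambda c: c["lipinski_violations"] <= 1 and c["qed"] > 0.3 and 200 <= c["mw"] <= 600),
    ("Bioavailability", lambda c: c["bioavailability"] > 0.3),
    ("Toxicity", lambda c: c["ames"] < 0.5 and c["herg"] < 0.5 and c["dili"] < 0.5),
    ("Alerts", lambda c: not c["pains"]),
]


def run_funnel(compounds, stages=STAGES):
    """Apply the stages in order; return survivors and (stage, remaining count) rows."""
    survivors = list(compounds)
    funnel = [("Input", len(survivors))]
    for name, keep in stages:
        survivors = [c for c in survivors if keep(c)]
        funnel.append((name, len(survivors)))
    return survivors, funnel
```

The `funnel` rows map directly onto the pass/fail table: the fail count at a stage is the difference between consecutive rows.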


Phase 6: Candidate Docking & Prioritization

Scoring Framework

| Dimension | Weight | Source |
|---|---|---|
| Docking confidence | 40% | NvidiaNIM_diffdock/boltz2 |
| ADMET score | 30% | ADMETAI predictions |
| Similarity to known active | 20% | Tanimoto coefficient |
| Novelty | 10% | Not in ChEMBL + novel scaffold bonus |
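The weighted combination can be written out as a short function. This sketch assumes each component has already been normalized to the range 0-1 with higher meaning better. How raw docking or ADMET outputs are normalized is not specified here and is left to the caller. The dict keys are names chosen for this example.

```python
WEIGHTS = {"docking": 0.40, "admet": 0.30, "similarity": 0.20, "novelty": 0.10}


def overall_score(components: dict) -> float:
    """Weighted sum of components, each assumed to be normalized to [0, 1]."""
    for name in WEIGHTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def rank_candidates(candidates, top_n=20):
    """Attach an 'overall' score to each candidate and return the top_n, best first."""
    scored = [{**c, "overall": overall_score(c)} for c in candidates]
    return sorted(scored, key=lambda c: c["overall"], reverse=True)[:top_n]
```

With docking 0.8, ADMET 0.5, similarity 0.6, and novelty 0.0, the overall score is 0.32 + 0.15 + 0.12 + 0.00 = 0.59.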

Evidence Tiers

| Tier | Criteria |
|---|---|
| T0 (4 stars) | Docking score > reference inhibitor |
| T1 (3 stars) | Experimental IC50/Ki < 100 nM |
| T2 (2 stars) | Docking within 5% of reference OR IC50 100-1000 nM |
| T3 (1 star) | >80% similarity to T1 compound |
| T4 (0 stars) | 70-80% similarity, scaffold match |
| T5 (empty) | Generated molecule, ADMET-passed, no docking |
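One possible reading of the tier criteria as code is shown below. Several points are assumptions rather than rules stated above: a higher docking score is treated as better, tiers are checked from T0 downward so the best matching tier wins, "within 5%" is taken relative to the reference score, and anything that matches no other tier falls through to T5.

```python
def evidence_tier(dock=None, ref_dock=None, potency_nm=None,
                  sim_to_t1=None, scaffold_match=False) -> str:
    """Assign an evidence tier; all arguments are optional pieces of evidence."""
    has_docking = dock is not None and ref_dock is not None
    if has_docking and dock > ref_dock:
        return "T0"
    if potency_nm is not None and potency_nm < 100:
        return "T1"
    near_reference = has_docking and abs(dock - ref_dock) <= 0.05 * abs(ref_dock)
    if near_reference or (potency_nm is not None and 100 <= potency_nm <= 1000):
        return "T2"
    if sim_to_t1 is not None and sim_to_t1 > 0.80:
        return "T3"
    if sim_to_t1 is not None and 0.70 <= sim_to_t1 <= 0.80 and scaffold_match:
        return "T4"
    return "T5"


print(evidence_tier(dock=0.78, ref_dock=0.80))
```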

Deliver top 20 candidates with: Rank, ID, SMILES, Docking score, ADMET score, overall score, source, evidence tier.


Phase 6.5: Literature Evidence

  • PubMed_search_articles(query="[TARGET] inhibitor SAR") - peer-reviewed
  • EuropePMC_search_articles(query, source="PPR") - preprints (not peer-reviewed)
  • openalex_search_works(query) - citation analysis

Fallback Chains

Target ID:     ChEMBL_search_targets -> GtoPdb_get_targets -> "Not in databases"
Druggability:  OpenTargets tractability -> DGIdb druggability -> target class proxy
Bioactivity:   ChEMBL -> BindingDB -> GtoPdb -> PubChem BioAssay -> "No data"
Structure:     PDB -> EMDB (membrane) -> NvidiaNIM_alphafold2 -> NvidiaNIM_esmfold -> AlphaFold DB -> "None"
Similarity:    ChEMBL similar -> PubChem similar -> "Search failed"
Docking:       NvidiaNIM_diffdock -> NvidiaNIM_boltz2 -> similarity-based scoring
Generation:    NvidiaNIM_genmol -> NvidiaNIM_molmim -> similarity search only
Literature:    PubMed -> EuropePMC (preprints) -> OpenAlex
GPCR data:     GPCRdb_get_protein -> GtoPdb_get_targets
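Each chain above follows the same pattern: try sources in order, move on when one errors or returns nothing, and record what was skipped. A generic runner might look like the sketch below. `first_success` is a hypothetical helper, and the callables passed in would be thin wrappers around the actual tool calls.

```python
def first_success(chain, *args, **kwargs):
    """Run (label, callable) pairs in order; return (label, result, skipped).

    A source counts as failed if it raises or returns an empty/falsy result.
    If every source fails, label and result are None.
    """
    skipped = []
    for label, fetch in chain:
        try:
            result = fetch(*args, **kwargs)
        except Exception as exc:  # tool errors should not abort the chain
            skipped.append((label, repr(exc)))
            continue
        if result:
            return label, result, skipped
        skipped.append((label, "empty result"))
    return None, None, skipped
```

The returned `label` doubles as the citation source for the report, and `skipped` documents which databases were tried without success.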

NVIDIA NIM Runtime Reference

| Tool | Runtime | Notes |
|---|---|---|
| NvidiaNIM_alphafold2 | 5-15 min | Async, max ~2000 AA |
| NvidiaNIM_esmfold | ~30 sec | Max 1024 AA |
| NvidiaNIM_diffdock | ~1-2 min | Per ligand |
| NvidiaNIM_boltz2 | ~2-5 min | End-to-end complex |
| NvidiaNIM_genmol | ~1-3 min | Depends on num_molecules |
| NvidiaNIM_molmim | ~1-2 min | Close analog generation |

Always check: import os; nvidia_available = bool(os.environ.get("NVIDIA_API_KEY"))


Rate Limiting

| Database | Limit | Strategy |
|---|---|---|
| ChEMBL | ~10 req/sec | Batch queries |
| PubChem | ~5 req/sec | Batch endpoints |
| ADMET-AI | No strict limit | Batch SMILES in lists |
| NVIDIA NIM | API key quota | Cache results |

For large expansions (>500 compounds): batch in chunks of 100, prioritize top candidates for docking.
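Chunking a large SMILES list into batches of 100 takes only a few lines. The helper name `chunks` is chosen for this sketch. Each batch would then be passed as the `smiles=[...]` list argument of an ADMET-AI call.

```python
def chunks(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


smiles = ["CCO"] * 250  # placeholder list standing in for an expansion result
print([len(batch) for batch in chunks(smiles)])
```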


Reference Files

For detailed protocols, examples, and templates, see:

| File | Contents |
|---|---|
| WORKFLOW_DETAILS.md | Phase-by-phase procedures, code patterns, screening protocols, fallback chain details |
| TOOLS_REFERENCE.md | Complete tool reference with parameters, usage examples, and fallback chains |
| REPORT_TEMPLATE.md | Report file template, evidence grading system, section formatting examples |
| EXAMPLES.md | End-to-end workflow examples (EGFR, novel target, lead optimization, NVIDIA NIM) |
| CHECKLIST.md | Pre-delivery verification checklist for report quality |
