tooluniverse-binder-discovery
Discover novel small molecule binders for protein targets using structure-based and ligand-based approaches. Creates actionable reports with candidate compounds, ADMET profiles, and synthesis feasibility. Use when users ask to find small molecules for a target, identify novel binders, perform virtual screening, or need hit-to-lead compound identification.
Install
mkdir -p .claude/skills/tooluniverse-binder-discovery && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3540" && unzip -o skill.zip -d .claude/skills/tooluniverse-binder-discovery && rm skill.zip
Installs to .claude/skills/tooluniverse-binder-discovery
About this skill
Small Molecule Binder Discovery Strategy
Systematic discovery of novel small molecule binders using 60+ ToolUniverse tools across druggability assessment, known ligand mining, similarity expansion, ADMET filtering, and synthesis feasibility.
KEY PRINCIPLES:
- Report-first approach - Create report file FIRST, then populate progressively
- Target validation FIRST - Confirm druggability before compound searching
- Multi-strategy approach - Combine structure-based and ligand-based methods
- ADMET-aware filtering - Eliminate poor compounds early
- Evidence grading - Grade candidates by supporting evidence
- Actionable output - Provide prioritized candidates with rationale
- English-first queries - Always use English terms in tool calls, even if the user writes in another language. Try original-language terms only as a fallback. Respond in the user's language.
Critical Workflow Requirements
1. Report-First Approach (MANDATORY)
DO NOT show search process or tool outputs to the user. Instead:
- Create the report file FIRST - Before any data collection:
  - File name: `[TARGET]_binder_discovery_report.md`
  - Initialize with all section headers from the template (see REPORT_TEMPLATE.md)
  - Add placeholder text: `[Researching...]` in each section
- Progressively update the report - As you gather data:
  - Update each section with findings immediately
  - The user sees the report growing, not the search process
- Output separate data files:
  - `[TARGET]_candidate_compounds.csv` - Prioritized compounds with SMILES, scores
  - `[TARGET]_bibliography.json` - Literature references (optional)
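The report-first steps can be sketched in plain Python. This is a minimal illustration, not the skill's own implementation: the helper names `init_report` and `update_section`, the `##` header level, and the section titles passed in are all assumptions, since REPORT_TEMPLATE.md is not reproduced here.

```python
from pathlib import Path

PLACEHOLDER = "[Researching...]"


def init_report(target: str, sections: list[str], out_dir: str = ".") -> Path:
    """Create the report skeleton with a placeholder under every section header."""
    path = Path(out_dir) / f"{target}_binder_discovery_report.md"
    lines = [f"# {target} Binder Discovery Report", ""]
    for title in sections:  # section titles are hypothetical; use the real template's
        lines += [f"## {title}", "", PLACEHOLDER, ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path


def update_section(path: Path, title: str, body: str) -> None:
    """Replace the placeholder under one section header with real findings."""
    text = path.read_text(encoding="utf-8")
    marker = f"## {title}\n\n{PLACEHOLDER}"
    if marker not in text:
        raise ValueError(f"no pending placeholder for section {title!r}")
    updated = text.replace(marker, f"## {title}\n\n{body}", 1)
    path.write_text(updated, encoding="utf-8")
```

Calling `init_report` before any data collection and `update_section` after each phase keeps the file on disk as the only thing the user watches change.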
2. Citation Requirements (MANDATORY)
Every piece of information MUST include its source:
*Source: ChEMBL via `ChEMBL_get_target_activities` (CHEMBL203)*
*Source: PDB via `get_protein_metadata_by_pdb_id` (1M17)*
*Source: ADMET-AI via `ADMETAI_predict_toxicity`*
*Source: NVIDIA NIM via `NvidiaNIM_alphafold2` (pLDDT: 90.94)*
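A small formatter can keep these source lines consistent across the report. The function name `format_source` is hypothetical; the output shape simply mirrors the four examples above.

```python
from typing import Optional


def format_source(database: str, tool: str, detail: Optional[str] = None) -> str:
    """Build a citation line in the '*Source: DB via `tool` (detail)*' shape."""
    suffix = f" ({detail})" if detail else ""
    return f"*Source: {database} via `{tool}`{suffix}*"


print(format_source("ChEMBL", "ChEMBL_get_target_activities", "CHEMBL203"))
print(format_source("ADMET-AI", "ADMETAI_predict_toxicity"))
```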
Workflow Overview
Phase 0: Tool Verification (check parameter names)
|
Phase 1: Target Validation
|- 1.1 Resolve identifiers (UniProt, Ensembl, ChEMBL target ID)
|- 1.2 Assess druggability/tractability
| +- 1.2a GPCRdb integration (for GPCR targets)
| +- 1.2.5 Check therapeutic antibodies (Thera-SAbDab)
|- 1.3 Identify binding sites
+- 1.4 Predict structure (NvidiaNIM_alphafold2/esmfold)
|
Phase 2: Known Ligand Mining
|- ChEMBL bioactivity data
|- GtoPdb interactions
|- Chemical probes (Open Targets)
|- BindingDB affinity data (Ki/IC50/Kd)
|- PubChem BioAssay HTS data (screening hits)
+- SAR analysis from known actives
|
Phase 3: Structure Analysis
|- PDB structures with ligands
|- EMDB cryo-EM structures (for membrane targets)
|- Binding pocket analysis
+- Key interactions
|
Phase 3.5: Docking Validation (NvidiaNIM_diffdock/boltz2)
|- Dock reference inhibitor
+- Validate binding pocket geometry
|
Phase 4: Compound Expansion
|- 4.1-4.3 Similarity/substructure search
+- 4.4 De novo generation (NvidiaNIM_genmol/molmim)
|
Phase 5: ADMET Filtering
|- Physicochemical properties (Lipinski, QED)
|- Bioavailability, toxicity, CYP interactions
+- Structural alerts (PAINS)
|
Phase 6: Candidate Docking & Prioritization
|- Dock all candidates (NvidiaNIM_diffdock/boltz2)
|- Score by docking (40%) + ADMET (30%) + similarity (20%) + novelty (10%)
|- Assess synthesis feasibility
+- Generate final ranked list (top 20)
|
Phase 6.5: Literature Evidence
|- PubMed (peer-reviewed SAR studies)
|- EuropePMC preprints (source='PPR')
+- OpenAlex citation analysis
|
Phase 7: Report Synthesis & Delivery
Phase 0: Tool Verification
CRITICAL: Verify tool parameters before calling unfamiliar tools.
tool_info = tu.tools.get_tool_info(tool_name="ChEMBL_get_target_activities")
Known Parameter Corrections
| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| OpenTargets_* | ensembl_id | ensemblId (camelCase) |
| ChEMBL_get_target_activities | chembl_target_id | target_chembl_id |
| ChEMBL_search_similar_molecules | smiles | molecule (accepts SMILES, ChEMBL ID, or name) |
| alphafold_get_prediction | uniprot | accession |
| ADMETAI_* | smiles="..." | smiles=["..."] (must be list) |
| NvidiaNIM_alphafold2 | seq | sequence |
| NvidiaNIM_genmol | smiles="C..." | smiles="C...[*{1-3}]..." (must have mask) |
| NvidiaNIM_boltz2 | sequence="..." | polymers=[{"molecule_type": "protein", "sequence": "..."}] |
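One way to apply these corrections defensively is to normalize keyword arguments before a tool call. The sketch below is an assumption-laden illustration: `normalize_kwargs` and `PARAM_RENAMES` are hypothetical names, the mapping only encodes rows from the table above, and it does not call ToolUniverse itself. The GenMol mask requirement is left out because it cannot be fixed by renaming.

```python
import fnmatch

# Renames copied from the corrections table; keys are shell-style patterns.
PARAM_RENAMES = {
    "OpenTargets_*": {"ensembl_id": "ensemblId"},
    "ChEMBL_get_target_activities": {"chembl_target_id": "target_chembl_id"},
    "ChEMBL_search_similar_molecules": {"smiles": "molecule"},
    "alphafold_get_prediction": {"uniprot": "accession"},
    "NvidiaNIM_alphafold2": {"seq": "sequence"},
}


def normalize_kwargs(tool_name: str, kwargs: dict) -> dict:
    """Return a copy of kwargs with known-wrong parameter names corrected."""
    fixed = dict(kwargs)
    for pattern, renames in PARAM_RENAMES.items():
        if fnmatch.fnmatchcase(tool_name, pattern):
            for wrong, right in renames.items():
                if wrong in fixed and right not in fixed:
                    fixed[right] = fixed.pop(wrong)
    # ADMETAI_* tools take a list of SMILES, not a bare string.
    if tool_name.startswith("ADMETAI_") and isinstance(fixed.get("smiles"), str):
        fixed["smiles"] = [fixed["smiles"]]
    # NvidiaNIM_boltz2 takes polymers=[{...}] rather than sequence="...".
    if tool_name == "NvidiaNIM_boltz2" and "sequence" in fixed and "polymers" not in fixed:
        fixed["polymers"] = [
            {"molecule_type": "protein", "sequence": fixed.pop("sequence")}
        ]
    return fixed
```

Checking the live parameter schema with `get_tool_info` remains the safer route; a static table like this can drift from the tools it describes.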
Phase 1: Target Validation
1.1 Identifier Resolution
Resolve all IDs upfront and store for downstream queries:
1. UniProt_search(query=target_name, organism="human") -> UniProt accession
2. MyGene_query_genes(q=gene_symbol, species="human") -> Ensembl gene ID
3. ChEMBL_search_targets(query=target_name, organism="Homo sapiens") -> ChEMBL target ID
4. GtoPdb_get_targets(query=target_name) -> GtoPdb ID (if GPCR/channel/enzyme)
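To store the resolved IDs for downstream queries, a small record type is enough. This is a sketch only; the class name `TargetIds`, its field names, and the choice of which IDs count as required are assumptions rather than part of the skill.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetIds:
    """Identifiers resolved in step 1.1; None means 'not resolved yet'."""

    gene_symbol: str
    uniprot_accession: Optional[str] = None
    ensembl_gene_id: Optional[str] = None
    chembl_target_id: Optional[str] = None
    gtopdb_id: Optional[str] = None  # only expected for GPCRs, channels, enzymes

    def missing(
        self,
        required=("uniprot_accession", "ensembl_gene_id", "chembl_target_id"),
    ) -> list:
        """List the required identifiers that are still unresolved."""
        return [name for name in required if getattr(self, name) is None]
```

Checking `missing()` before Phase 2 makes a failed lookup visible early instead of surfacing as an empty result several tools later.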
1.2 Druggability Assessment
Use multi-source triangulation:
- `OpenTargets_get_target_tractability_by_ensemblID(ensemblId)` - tractability bucket
- `DGIdb_get_gene_druggability(genes=[gene_symbol])` - druggability categories
- `OpenTargets_get_target_classes_by_ensemblID(ensemblId)` - target class
- For GPCRs: `GPCRdb_get_protein` + `GPCRdb_get_ligands` + `GPCRdb_get_structures`
- For antibody landscape: `TheraSAbDab_search_by_target(target=target_name)`
Decision Point: If druggability < 2 stars, warn user about challenges.
1.3 Binding Site Analysis
- `ChEMBL_search_binding_sites(target_chembl_id)`
- `get_binding_affinity_by_pdb_id(pdb_id)` for co-crystallized ligands
- `InterPro_get_protein_domains(accession)` for domain architecture
1.4 Structure Prediction (NVIDIA NIM)
Requires NVIDIA_API_KEY. Two options:
- AlphaFold2: `NvidiaNIM_alphafold2(sequence, algorithm="mmseqs2")` - high accuracy, 5-15 min
- ESMFold: `NvidiaNIM_esmfold(sequence)` - fast (~30s), max 1024 AA
Always report pLDDT confidence scores (>=90 very high, 70-90 confident, <70 caution).
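The pLDDT bands translate directly into a lookup. The function name `plddt_band` is hypothetical, and treating exactly 90 as "very high" and exactly 70 as "confident" follows the `>=90` and `70-90` wording above.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100 scale) to the confidence band used in the report."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT is reported on a 0-100 scale")
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    return "caution"


print(plddt_band(90.94))  # the example score from the citation section
```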
Phase 2: Known Ligand Mining
Tools (in order of priority)
| Source | Tool | Strengths |
|---|---|---|
| ChEMBL | ChEMBL_get_target_activities | Curated, SAR-ready |
| BindingDB | BindingDB_get_ligands_by_uniprot | Direct Ki/Kd, literature links |
| GtoPdb | GtoPdb_get_target_interactions | Pharmacology focus (GPCRs, channels) |
| PubChem | PubChem_search_assays_by_target_gene | HTS screens, novel scaffolds |
| Open Targets | OpenTargets_get_chemical_probes_by_target_ensemblID | Validated probes |
Key Steps
- Get all bioactivities: filter to IC50/Ki/Kd < 10 uM
- Get molecule details for top actives: `ChEMBL_get_molecule`
- Identify chemical probes and approved drugs
- Analyze SAR: common scaffolds, key modifications
- Check off-target selectivity: `BindingDB_get_targets_by_compound`
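The first step, filtering to IC50/Ki/Kd below 10 uM, could look like the sketch below. It assumes each activity record is a dict carrying ChEMBL-style `standard_type`, `standard_value`, and `standard_units` fields; the exact shape returned by the ToolUniverse wrapper is not shown in this document, so treat the field access as an assumption. Records with other units or unparseable values are skipped rather than guessed at.

```python
# Conversion factors to micromolar for the units handled in this sketch.
TO_MICROMOLAR = {"nM": 1e-3, "uM": 1.0, "µM": 1.0, "mM": 1e3}
POTENCY_TYPES = {"IC50", "Ki", "Kd"}


def potent_actives(records, cutoff_um=10.0):
    """Keep IC50/Ki/Kd records below the cutoff, sorted from most to least potent."""
    kept = []
    for rec in records:
        factor = TO_MICROMOLAR.get(rec.get("standard_units"))
        if rec.get("standard_type") not in POTENCY_TYPES or factor is None:
            continue
        try:
            value_um = float(rec["standard_value"]) * factor
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value
        if value_um < cutoff_um:
            kept.append({**rec, "value_uM": value_um})
    return sorted(kept, key=lambda r: r["value_uM"])
```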
Phase 3: Structure Analysis
Tools
- `PDB_search_similar_structures(query=uniprot, type="sequence")` - find PDB entries
- `get_protein_metadata_by_pdb_id(pdb_id)` - resolution, method
- `get_binding_affinity_by_pdb_id(pdb_id)` - co-crystal ligand affinities
- `get_ligand_smiles_by_chem_comp_id(chem_comp_id)` - ligand SMILES from PDB
- `emdb_search(query)` - cryo-EM structures (prefer for GPCRs, ion channels)
- `alphafold_get_prediction(accession)` - AlphaFold DB fallback
Phase 3.5: Docking Validation (NVIDIA NIM)
| Situation | Tool | Input |
|---|---|---|
| Have PDB + SDF | NvidiaNIM_diffdock | protein=PDB, ligand=SDF, num_poses=10 |
| Have sequence + SMILES | NvidiaNIM_boltz2 | polymers=[...], ligands=[...] |
Dock a known reference inhibitor first to validate the binding pocket.
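The tool choice in the table can be expressed as a small dispatch function. The name `choose_docking_tool` is hypothetical, and preferring DiffDock when both input pairs are available is an assumption of this sketch, not a rule stated by the skill.

```python
def choose_docking_tool(
    has_pdb: bool, has_sdf: bool, has_sequence: bool, has_smiles: bool
) -> str:
    """Pick a docking tool name from the inputs that are actually available."""
    if has_pdb and has_sdf:
        return "NvidiaNIM_diffdock"  # protein=PDB, ligand=SDF
    if has_sequence and has_smiles:
        return "NvidiaNIM_boltz2"  # polymers=[...], ligands=[...]
    raise ValueError("need either PDB + SDF or sequence + SMILES to dock")
```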
Phase 4: Compound Expansion
4.1-4.3 Search-Based Expansion
Use 3-5 diverse actives as seeds, similarity threshold 70-85%:
- `ChEMBL_search_similar_molecules(molecule=SMILES, similarity=70)`
- `PubChem_search_compounds_by_similarity(smiles, threshold=0.7)`
- `ChEMBL_search_substructure(smiles=core_scaffold)`
- `STITCH_get_chemical_protein_interactions(identifier=gene, species=9606)`
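Note that the two similarity calls above express the same cutoff on different scales: `similarity=70` (percent) and `threshold=0.7` (fraction). The sketch below shows the Tanimoto coefficient on fingerprint bit sets and a helper for converting between the two scales. Both function names are hypothetical, and real fingerprints would come from a cheminformatics toolkit rather than hand-written sets.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared 'on' bits divided by all 'on' bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def as_percent(threshold: float) -> int:
    """Normalize a cutoff given as a fraction (0.7) or a percent (70) to percent."""
    return round(threshold * 100) if threshold <= 1 else round(threshold)


print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 2 shared bits out of 4 total
```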
4.4 De Novo Generation (NVIDIA NIM)
GenMol - scaffold hopping with masked regions:
NvidiaNIM_genmol(smiles="...core...[*{3-8}]...tail...[*{1-3}]...", num_molecules=100, temperature=2.0, scoring="QED")
Mask syntax: [*{min-max}] specifies atom count range.
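Because GenMol input must contain at least one mask, it can help to build and check mask tokens programmatically. The helpers below are hypothetical, cover only the `[*{min-max}]` form described above, and assume that a range needs non-negative bounds with min no greater than max.

```python
import re

MASK_RE = re.compile(r"\[\*\{(\d+)-(\d+)\}\]")


def mask(min_atoms: int, max_atoms: int) -> str:
    """Build a mask token such as [*{3-8}] for a given atom-count range."""
    if not 0 <= min_atoms <= max_atoms:
        raise ValueError("expected 0 <= min_atoms <= max_atoms")
    return f"[*{{{min_atoms}-{max_atoms}}}]"


def find_masks(template: str) -> list:
    """Return the (min, max) ranges of every mask token in a template string."""
    return [(int(lo), int(hi)) for lo, hi in MASK_RE.findall(template)]


print(mask(3, 8))
```

An empty result from `find_masks` is a cheap signal that a template would be rejected for lacking a mask.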
MolMIM - controlled analog generation:
NvidiaNIM_molmim(smi=reference_smiles, num_molecules=50, algorithm="CMA-ES")
Phase 5: ADMET Filtering
Apply filters sequentially (all take smiles=[list]):
| Step | Tool | Filter Criteria |
|---|---|---|
| Physicochemical | ADMETAI_predict_physicochemical_properties | Lipinski <= 1, QED > 0.3, MW 200-600 |
| Bioavailability | ADMETAI_predict_bioavailability | Oral bioavailability > 0.3 |
| Toxicity | ADMETAI_predict_toxicity | AMES < 0.5, hERG < 0.5, DILI < 0.5 |
| CYP | ADMETAI_predict_CYP_interactions | Flag CYP3A4 inhibitors |
| Alerts | ChEMBL_search_compound_structural_alerts | No PAINS |
Include a filter funnel table in the report showing pass/fail counts at each stage.
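A filter funnel of this kind can be sketched as a list of named predicates applied in order. The dictionary keys below (`lipinski_violations`, `qed`, `mw`, and so on) are hypothetical stand-ins for whatever the ADMET-AI tools return, and "Lipinski <= 1" is read here as at most one rule violation. The CYP step is omitted because the table says to flag CYP3A4 inhibitors, not remove them.

```python
# Each entry: (stage name, predicate that returns True when a compound passes).
FILTERS = [
    ("Physicochemical",
     lambda c: c["lipinski_violations"] <= 1 and c["qed"] > 0.3 and 200 <= c["mw"] <= 600),
    ("Bioavailability", lambda c: c["bioavailability"] > 0.3),
    ("Toxicity", lambda c: c["ames"] < 0.5 and c["herg"] < 0.5 and c["dili"] < 0.5),
    ("Alerts", lambda c: not c["pains"]),
]


def run_funnel(compounds):
    """Apply filters in order; return survivors and (stage, in, passed, failed) rows."""
    rows, remaining = [], list(compounds)
    for name, keep in FILTERS:
        passed = [c for c in remaining if keep(c)]
        rows.append((name, len(remaining), len(passed), len(remaining) - len(passed)))
        remaining = passed
    return remaining, rows
```

The `rows` list maps one-to-one onto the funnel table requested for the report.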
Phase 6: Candidate Docking & Prioritization
Scoring Framework
| Dimension | Weight | Source |
|---|---|---|
| Docking confidence | 40% | NvidiaNIM_diffdock/boltz2 |
| ADMET score | 30% | ADMETAI predictions |
| Similarity to known active | 20% | Tanimoto coefficient |
| Novelty | 10% | Not in ChEMBL + novel scaffold bonus |
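The weighted score can be computed as below. The sketch assumes every component has already been normalized to the range 0 to 1, which the table does not specify. The function names and the `scores` key are hypothetical; the cutoff of 20 matches the "top 20" in the workflow overview.

```python
WEIGHTS = {"docking": 0.40, "admet": 0.30, "similarity": 0.20, "novelty": 0.10}


def composite_score(components: dict) -> float:
    """Weighted sum of the four dimensions; each component must be in [0, 1]."""
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def rank(candidates, top_n=20):
    """Sort candidates by composite score, best first, and keep the top N."""
    ordered = sorted(candidates, key=lambda c: composite_score(c["scores"]), reverse=True)
    return ordered[:top_n]
```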
Evidence Tiers
| Tier | Criteria |
|---|---|
| T0 (4 stars) | Docking score > reference inhibitor |
| T1 (3 stars) | Experimental IC50/Ki < 100 nM |
| T2 (2 stars) | Docking within 5% of refe |
Content truncated.