chembl-database

Query ChEMBL bioactive molecules and drug discovery data. Search compounds by structure or properties, retrieve bioactivity data (IC50, Ki), find inhibitors, and perform SAR studies for medicinal chemistry.

Install

mkdir -p .claude/skills/chembl-database && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5178" && unzip -o skill.zip -d .claude/skills/chembl-database && rm skill.zip

Installs to .claude/skills/chembl-database

About this skill

ChEMBL Database

Overview

ChEMBL is a manually curated database of bioactive molecules maintained by the European Bioinformatics Institute (EBI), containing over 2 million compounds, 19 million bioactivity measurements, 13,000+ drug targets, and data on approved drugs and clinical candidates. Access and query this data programmatically using the ChEMBL Python client for drug discovery and medicinal chemistry research.

When to Use This Skill

Use this skill for the following tasks:

  • Compound searches: Finding molecules by name, structure, or properties
  • Target information: Retrieving data about proteins, enzymes, or biological targets
  • Bioactivity data: Querying IC50, Ki, EC50, or other activity measurements
  • Drug information: Looking up approved drugs, mechanisms, or indications
  • Structure searches: Performing similarity or substructure searches
  • Cheminformatics: Analyzing molecular properties and drug-likeness
  • Target-ligand relationships: Exploring compound-target interactions
  • Drug discovery: Identifying inhibitors, agonists, or bioactive molecules

Installation and Setup

Python Client

The ChEMBL Python client is required for programmatic access:

uv pip install chembl_webresource_client

Basic Usage Pattern

from chembl_webresource_client.new_client import new_client

# Access different endpoints
molecule = new_client.molecule
target = new_client.target
activity = new_client.activity
drug = new_client.drug

Core Capabilities

1. Molecule Queries

Retrieve by ChEMBL ID:

molecule = new_client.molecule
aspirin = molecule.get('CHEMBL25')

Search by name:

results = molecule.filter(pref_name__icontains='aspirin')

Filter by properties:

# Find small molecules (MW <= 500) with favorable LogP
results = molecule.filter(
    molecule_properties__mw_freebase__lte=500,
    molecule_properties__alogp__lte=5
)

2. Target Queries

Retrieve target information:

target = new_client.target
egfr = target.get('CHEMBL203')

Search for specific target types:

# Find all kinase targets
kinases = target.filter(
    target_type='SINGLE PROTEIN',
    pref_name__icontains='kinase'
)

3. Bioactivity Data

Query activities for a target:

activity = new_client.activity
# Find potent EGFR inhibitors
results = activity.filter(
    target_chembl_id='CHEMBL203',
    standard_type='IC50',
    standard_value__lte=100,
    standard_units='nM'
)

Get all activities for a compound:

compound_activities = activity.filter(
    molecule_chembl_id='CHEMBL25',
    pchembl_value__isnull=False
)

4. Structure-Based Searches

Similarity search:

similarity = new_client.similarity
# Find compounds similar to aspirin
similar = similarity.filter(
    smiles='CC(=O)Oc1ccccc1C(=O)O',
    similarity=85  # 85% similarity threshold
)

Substructure search:

substructure = new_client.substructure
# Find compounds containing benzene ring
results = substructure.filter(smiles='c1ccccc1')

5. Drug Information

Retrieve drug data:

drug = new_client.drug
drug_info = drug.get('CHEMBL25')

Get mechanisms of action:

mechanism = new_client.mechanism
mechanisms = mechanism.filter(molecule_chembl_id='CHEMBL25')

Query drug indications:

drug_indication = new_client.drug_indication
indications = drug_indication.filter(molecule_chembl_id='CHEMBL25')

Query Workflow

Workflow 1: Finding Inhibitors for a Target

  1. Identify the target by searching by name:

    targets = new_client.target.filter(pref_name__icontains='EGFR')
    target_id = targets[0]['target_chembl_id']
    
  2. Query bioactivity data for that target:

    # standard_value is only meaningful together with its units
    activities = new_client.activity.filter(
        target_chembl_id=target_id,
        standard_type='IC50',
        standard_value__lte=100,
        standard_units='nM'
    )
    
  3. Extract compound IDs and retrieve details:

    # Deduplicate: one compound often has several activity records
    compound_ids = sorted({act['molecule_chembl_id'] for act in activities})
    compounds = [new_client.molecule.get(cid) for cid in compound_ids]
    

Workflow 2: Analyzing a Known Drug

  1. Get drug information:

    drug_info = new_client.drug.get('CHEMBL1234')
    
  2. Retrieve mechanisms:

    mechanisms = new_client.mechanism.filter(molecule_chembl_id='CHEMBL1234')
    
  3. Find all bioactivities:

    activities = new_client.activity.filter(molecule_chembl_id='CHEMBL1234')
    

Workflow 3: Structure-Activity Relationship (SAR) Study

  1. Find similar compounds:

    similar = new_client.similarity.filter(smiles='query_smiles', similarity=80)
    
  2. Get activities for each compound:

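    activities_by_compound = {}
    for compound in similar:
        cid = compound['molecule_chembl_id']
        # Store each result set so it is not overwritten on the next iteration
        activities_by_compound[cid] = list(
            new_client.activity.filter(molecule_chembl_id=cid)
        )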
    
  3. Analyze property-activity relationships using molecular properties from results.
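
One way to sketch step 3 is to summarize potency per compound and relate it to lipophilicity. The example below is only an illustration: the records are made up, the compound IDs are placeholders, and it assumes that each activity record has already been joined with the compound's `alogp` value (activity results do not carry it themselves). The function name `summarize` is hypothetical. It computes the median pChEMBL per compound and the lipophilic efficiency (LipE = pChEMBL - logP).

```python
from statistics import median

# Made-up records: activity values joined with an alogp value per compound
records = [
    {"molecule_chembl_id": "CPD_1", "pchembl_value": "7.2", "alogp": "3.1"},
    {"molecule_chembl_id": "CPD_1", "pchembl_value": "7.6", "alogp": "3.1"},
    {"molecule_chembl_id": "CPD_2", "pchembl_value": "8.0", "alogp": "5.0"},
    {"molecule_chembl_id": "CPD_3", "pchembl_value": None, "alogp": "2.0"},
]


def summarize(records):
    """Return {compound: (median pChEMBL, alogp, LipE)}, skipping missing values."""
    grouped = {}
    for rec in records:
        if rec["pchembl_value"] is None or rec["alogp"] is None:
            continue
        entry = grouped.setdefault(
            rec["molecule_chembl_id"],
            {"potencies": [], "alogp": float(rec["alogp"])},
        )
        entry["potencies"].append(float(rec["pchembl_value"]))

    summary = {}
    for cid, entry in grouped.items():
        potency = median(entry["potencies"])
        summary[cid] = (potency, entry["alogp"], potency - entry["alogp"])
    return summary


sar_summary = summarize(records)
for cid, (potency, alogp, lipe) in sorted(sar_summary.items()):
    print(f"{cid}: pChEMBL={potency:.2f} alogp={alogp:.2f} LipE={lipe:.2f}")
```

In this made-up data the more potent compound is also the more lipophilic one, so its LipE is lower; that kind of trade-off is what a property-activity analysis is meant to surface.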

Filter Operators

ChEMBL supports Django-style query filters:

  • __exact - Exact match
  • __iexact - Case-insensitive exact match
  • __contains / __icontains - Substring matching
  • __startswith / __endswith - Prefix/suffix matching
  • __gt, __gte, __lt, __lte - Numeric comparisons
  • __range - Value in range
  • __in - Value in list
  • __isnull - Null/not null check
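
The operators above correspond to query-string parameters of the ChEMBL web services. The sketch below uses only the standard library and sends no request; it just shows how keyword filters could be rendered as a URL. The helper `build_query_url` is hypothetical and not part of the client, and the comma-separated encoding of list values (for `__in` and `__range`) is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def build_query_url(endpoint, **filters):
    """Hypothetical helper: render Django-style filters as a REST query URL."""
    params = {}
    for key, value in filters.items():
        # Assumption: list values are sent as a comma-separated string
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = value
    # Keep commas readable instead of percent-encoding them
    return f"{BASE_URL}/{endpoint}.json?{urlencode(params, safe=',')}"


url = build_query_url(
    "activity",
    target_chembl_id="CHEMBL203",
    standard_type="IC50",
    standard_value__lte=100,
)
print(url)
```

Printing the URL for a query is a quick way to check which field and operator a filter actually applies before running it through the client.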

Data Export and Analysis

Convert results to pandas DataFrame for analysis:

import pandas as pd

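activities = new_client.activity.filter(target_chembl_id='CHEMBL203')
df = pd.DataFrame(list(activities))

# Numeric fields may arrive as strings; convert before computing statistics
df['standard_value'] = pd.to_numeric(df['standard_value'], errors='coerce')

# Analyze results
print(df['standard_value'].describe())
print(df.groupby('standard_type').size())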

Performance Optimization

Caching

The client automatically caches results for 24 hours. Configure caching:

from chembl_webresource_client.settings import Settings

# Disable caching
Settings.Instance().CACHING = False

# Adjust cache expiration (seconds)
Settings.Instance().CACHE_EXPIRE = 86400

Lazy Evaluation

Queries execute only when data is accessed. Convert to list to force execution:

# Query is not executed yet
results = molecule.filter(pref_name__icontains='aspirin')

# Force execution
results_list = list(results)

Pagination

Results are paginated automatically. Iterate through all results:

for activity in new_client.activity.filter(target_chembl_id='CHEMBL203'):
    # Process each activity
    print(activity['molecule_chembl_id'])

Common Use Cases

Find Kinase Inhibitors

# Identify kinase targets
kinases = new_client.target.filter(
    target_type='SINGLE PROTEIN',
    pref_name__icontains='kinase'
)

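# Get potent inhibitors (IC50 <= 50 nM) for the first 5 kinases
potent_by_target = {}
for kinase in kinases[:5]:
    target_id = kinase['target_chembl_id']
    potent_by_target[target_id] = new_client.activity.filter(
        target_chembl_id=target_id,
        standard_type='IC50',
        standard_value__lte=50,
        standard_units='nM'
    )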

Explore Drug Repurposing

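# Get approved drugs
drugs = new_client.drug.filter()

# For each of the first 10 drugs, collect its mechanisms (which name the targets)
mechanisms_by_drug = {}
for drug_record in drugs[:10]:
    chembl_id = drug_record['molecule_chembl_id']
    mechanisms_by_drug[chembl_id] = list(
        new_client.mechanism.filter(molecule_chembl_id=chembl_id)
    )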

Virtual Screening

# Find compounds with desired properties
candidates = new_client.molecule.filter(
    molecule_properties__mw_freebase__range=[300, 500],
    molecule_properties__alogp__lte=5,
    molecule_properties__hba__lte=10,
    molecule_properties__hbd__lte=5
)
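
The upper bounds in the filter above match Lipinski's rule of five. As a minimal offline sketch, the same thresholds can be re-checked on records that have already been downloaded. The function name `ro5_violations` is hypothetical, the property dictionaries are illustrative values rather than ChEMBL output, and the code assumes numeric fields may arrive either as numbers or as strings.

```python
def ro5_violations(props):
    """Count rule-of-five violations in a molecule_properties-like dict."""
    checks = [
        float(props["mw_freebase"]) > 500,  # molecular weight
        float(props["alogp"]) > 5,          # lipophilicity
        int(props["hba"]) > 10,             # hydrogen-bond acceptors
        int(props["hbd"]) > 5,              # hydrogen-bond donors
    ]
    return sum(checks)


# Illustrative values only
small_molecule = {"mw_freebase": "180.16", "alogp": "1.31", "hba": 3, "hbd": 1}
large_molecule = {"mw_freebase": "650.0", "alogp": "6.2", "hba": 12, "hbd": 2}

print(ro5_violations(small_molecule))
print(ro5_violations(large_molecule))
```

Counting violations, rather than returning a yes/no answer, makes it possible to rank candidates or to tolerate a single violation.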

Resources

scripts/example_queries.py

Ready-to-use Python functions demonstrating common ChEMBL query patterns:

  • get_molecule_info() - Retrieve molecule details by ID
  • search_molecules_by_name() - Name-based molecule search
  • find_molecules_by_properties() - Property-based filtering
  • get_bioactivity_data() - Query bioactivities for targets
  • find_similar_compounds() - Similarity searching
  • substructure_search() - Substructure matching
  • get_drug_info() - Retrieve drug information
  • find_kinase_inhibitors() - Specialized kinase inhibitor search
  • export_to_dataframe() - Convert results to pandas DataFrame

Consult this script for implementation details and usage examples.

references/api_reference.md

Comprehensive API documentation including:

  • Complete endpoint listing (molecule, target, activity, assay, drug, etc.)
  • All filter operators and query patterns
  • Molecular properties and bioactivity fields
  • Advanced query examples
  • Configuration and performance tuning
  • Error handling and rate limiting

Refer to this document when detailed API information is needed or when troubleshooting queries.

Important Notes

Data Reliability

  • ChEMBL data is manually curated but may contain inconsistencies
  • Always check data_validity_comment field in activity records
  • Be aware of potential_duplicate flags
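
The two checks above can be applied as a simple post-filter on downloaded activity records. This is a sketch under assumptions: the sample records are made up with placeholder IDs, the helper name `keep_reliable` is hypothetical, and the duplicate flag is assumed to be encoded as 1, True, or "1".

```python
def keep_reliable(records):
    """Drop records that carry a validity comment or a duplicate flag."""
    reliable = []
    for rec in records:
        if rec.get("data_validity_comment"):
            continue  # curators flagged a possible problem with this value
        if rec.get("potential_duplicate") in (1, True, "1"):
            continue  # value may repeat a measurement reported elsewhere
        reliable.append(rec)
    return reliable


# Made-up records for illustration
sample = [
    {"molecule_chembl_id": "CPD_A", "data_validity_comment": None,
     "potential_duplicate": 0},
    {"molecule_chembl_id": "CPD_B", "data_validity_comment": "Outside typical range",
     "potential_duplicate": 0},
    {"molecule_chembl_id": "CPD_C", "data_validity_comment": None,
     "potential_duplicate": 1},
]

reliable = keep_reliable(sample)
print([rec["molecule_chembl_id"] for rec in reliable])
```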

Units and Standards

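  • Bioactivity values use standard units (nM, µM, etc.); always read standard_value together with standard_units
  • pchembl_value provides normalized activity on a negative log10 scale of the molar value, so 1 µM corresponds to 6 and 1 nM to 9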
  • Check standard_type to understand measurement type (IC50, Ki, EC50, etc.)
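
To make the negative-log scale concrete, the sketch below converts a concentration in nM to the corresponding -log10 molar value. It assumes the usual pChEMBL definition (-log10 of the molar concentration); the function name `pchembl_from_nm` is hypothetical and not part of the client.

```python
import math


def pchembl_from_nm(value_nm):
    """Convert a concentration in nM to -log10 of the molar concentration."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    # 1 nM = 1e-9 M, so -log10(value_nm * 1e-9) = 9 - log10(value_nm)
    return 9.0 - math.log10(value_nm)


for nm in (1, 100, 10_000):
    print(f"{nm} nM -> {pchembl_from_nm(nm):.2f}")
```

Each tenfold drop in concentration raises the value by one unit, which is why potencies measured over several orders of magnitude are easier to compare on this scale.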

Rate Limiting

  • Respect ChEMBL's fair usage policies
  • Use caching to minimize repeated requests
  • Consider bulk downloads for large datasets
  • Avoid hammering the API with rapid consecutive requests
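
One way to reduce the number of requests is to look up identifiers in batches instead of one at a time. The helper below is a hypothetical, standard-library sketch; the IDs are placeholders and the batch size of 3 is arbitrary, chosen only to keep the output short.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


ids = [f"CHEMBL{n}" for n in range(1, 8)]  # placeholder IDs
batches = list(chunked(ids, 3))
print(batches)
```

Each batch could then be passed to a single query through the `__in` operator, for example `molecule_chembl_id__in=batch`, rather than issuing one request per compound.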

Chemical Structure Formats

  • SMILES strings are the primary structure format
  • InChI keys available for compounds
  • SVG images can be generated via the image endpoint
