pubchem-database
Query PubChem via PUG-REST API/PubChemPy (110M+ compounds). Search by name/CID/SMILES, retrieve properties, similarity/substructure searches, bioactivity, for cheminformatics.
Install
mkdir -p .claude/skills/pubchem-database && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3694" && unzip -o skill.zip -d .claude/skills/pubchem-database && rm skill.zip
Installs to .claude/skills/pubchem-database
About this skill
PubChem Database
Overview
PubChem is the world's largest freely available chemical database, with 110M+ compounds and 270M+ bioactivities. Using the PUG-REST API and PubChemPy, you can query chemical structures by name, CID, or SMILES, retrieve molecular properties, perform similarity and substructure searches, and access bioactivity data.
When to Use This Skill
This skill should be used when:
- Searching for chemical compounds by name, structure (SMILES/InChI), or molecular formula
- Retrieving molecular properties (MW, LogP, TPSA, hydrogen bonding descriptors)
- Performing similarity searches to find structurally related compounds
- Conducting substructure searches for specific chemical motifs
- Accessing bioactivity data from screening assays
- Converting between chemical identifier formats (CID, SMILES, InChI)
- Batch processing multiple compounds for drug-likeness screening or property analysis
Core Capabilities
1. Chemical Structure Search
Search for compounds using multiple identifier types:
By Chemical Name:
import pubchempy as pcp
compounds = pcp.get_compounds('aspirin', 'name')
compound = compounds[0]
By CID (Compound ID):
compound = pcp.Compound.from_cid(2244) # Aspirin
By SMILES:
compound = pcp.get_compounds('CC(=O)OC1=CC=CC=C1C(=O)O', 'smiles')[0]
By InChI:
compound = pcp.get_compounds('InChI=1S/C9H8O4/...', 'inchi')[0]
By Molecular Formula:
compounds = pcp.get_compounds('C9H8O4', 'formula')
# Returns all compounds matching this formula
2. Property Retrieval
Retrieve molecular properties for compounds using either high-level or low-level approaches:
Using PubChemPy (Recommended):
import pubchempy as pcp
# Get compound object with all properties
compound = pcp.get_compounds('caffeine', 'name')[0]
# Access individual properties
molecular_formula = compound.molecular_formula
molecular_weight = compound.molecular_weight
iupac_name = compound.iupac_name
smiles = compound.canonical_smiles
inchi = compound.inchi
xlogp = compound.xlogp # Partition coefficient
tpsa = compound.tpsa # Topological polar surface area
Get Specific Properties:
# Request only specific properties
properties = pcp.get_properties(
    ['MolecularFormula', 'MolecularWeight', 'CanonicalSMILES', 'XLogP'],
    'aspirin',
    'name'
)
# Returns a list of dictionaries
Batch Property Retrieval:
import pandas as pd
import pubchempy as pcp
compound_names = ['aspirin', 'ibuprofen', 'paracetamol']
all_properties = []
for name in compound_names:
    # One request per name; each call returns a list of dictionaries
    props = pcp.get_properties(
        ['MolecularFormula', 'MolecularWeight', 'XLogP'],
        name,
        'name'
    )
    all_properties.extend(props)
df = pd.DataFrame(all_properties)
Available Properties: MolecularFormula, MolecularWeight, CanonicalSMILES, IsomericSMILES, InChI, InChIKey, IUPACName, XLogP, TPSA, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, Complexity, Charge, and many more (see references/api_reference.md for complete list).
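The drug-likeness screening mentioned earlier can be sketched on top of these property dictionaries. The example below is a minimal sketch, not part of the skill's scripts: `lipinski_violations` is a hypothetical helper, the sample record is typed in by hand to resemble one `pcp.get_properties()` result, and it assumes the `HBondDonorCount` and `HBondAcceptorCount` properties were requested along with `MolecularWeight` and `XLogP`. Numeric fields are coerced because they may arrive as strings, and a missing `XLogP` is skipped rather than counted as a violation.

```python
def lipinski_violations(props):
    """Count Lipinski rule-of-five violations for one property dictionary."""
    violations = 0
    # Numeric fields may arrive as strings, so coerce before comparing.
    if float(props["MolecularWeight"]) > 500:
        violations += 1
    xlogp = props.get("XLogP")  # not every compound has a computed XLogP
    if xlogp is not None and float(xlogp) > 5:
        violations += 1
    if int(props["HBondDonorCount"]) > 5:
        violations += 1
    if int(props["HBondAcceptorCount"]) > 10:
        violations += 1
    return violations


# Hand-written record for illustration only (shaped like a get_properties result).
sample = {
    "CID": 2244,
    "MolecularWeight": "180.16",
    "XLogP": 1.2,
    "HBondDonorCount": 1,
    "HBondAcceptorCount": 4,
}
print(lipinski_violations(sample))
```

A common convention is to treat compounds with at most one violation as drug-like, but the cutoff is a choice for the analysis, not something PubChem enforces.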
3. Similarity Search
Find structurally similar compounds using Tanimoto similarity:
import pubchempy as pcp
# Start with a query compound
query_compound = pcp.get_compounds('gefitinib', 'name')[0]
query_smiles = query_compound.canonical_smiles
# Perform similarity search
similar_compounds = pcp.get_compounds(
    query_smiles,
    'smiles',
    searchtype='similarity',
    Threshold=85,  # Similarity threshold (0-100)
    MaxRecords=50
)
# Process results
for compound in similar_compounds[:10]:
    print(f"CID {compound.cid}: {compound.iupac_name}")
    print(f"  MW: {compound.molecular_weight}")
Note: Similarity searches are asynchronous for large queries and may take 15-30 seconds to complete. PubChemPy handles the asynchronous pattern automatically.
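To make the `Threshold` value concrete, the sketch below computes a Tanimoto coefficient locally. It is illustrative only: PubChem computes the score server-side on its own substructure fingerprints, while this example assumes each fingerprint is simply a set of "on" bit positions, and `tanimoto` and `passes_threshold` are hypothetical helper names.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / len(union)


def passes_threshold(bits_a, bits_b, threshold=85):
    """Compare on the same 0-100 scale that the Threshold parameter uses."""
    return tanimoto(bits_a, bits_b) * 100 >= threshold


# Toy fingerprints: 2 shared bits out of 4 distinct bits in total.
print(tanimoto({1, 2, 3}, {2, 3, 4}))
print(passes_threshold({1, 2, 3}, {2, 3, 4}))
```

Under this definition, a threshold of 85 only admits pairs whose fingerprints overlap in at least 85% of their combined on-bits.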
4. Substructure Search
Find compounds containing a specific structural motif:
import pubchempy as pcp
# Search for compounds containing pyridine ring
pyridine_smiles = 'c1ccncc1'
matches = pcp.get_compounds(
    pyridine_smiles,
    'smiles',
    searchtype='substructure',
    MaxRecords=100
)
print(f"Found {len(matches)} compounds containing pyridine")
Common Substructures:
- Benzene ring: c1ccccc1
- Pyridine: c1ccncc1
- Phenol: c1ccc(O)cc1
- Carboxylic acid: C(=O)O
5. Format Conversion
Convert between different chemical structure formats:
import pubchempy as pcp
compound = pcp.get_compounds('aspirin', 'name')[0]
# Convert to different formats
smiles = compound.canonical_smiles
inchi = compound.inchi
inchikey = compound.inchikey
cid = compound.cid
# Download structure files (argument order: format, output path, identifier, namespace)
pcp.download('SDF', 'aspirin.sdf', 'aspirin', 'name', overwrite=True)
pcp.download('JSON', 'aspirin.json', '2244', 'cid', overwrite=True)
6. Structure Visualization
Generate 2D structure images:
import pubchempy as pcp
# Download compound structure as PNG (argument order: format, output path, identifier, namespace)
pcp.download('PNG', 'caffeine.png', 'caffeine', 'name', overwrite=True)
# Using direct URL (via requests)
import requests
cid = 2244  # Aspirin
url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{cid}/PNG?image_size=large"
response = requests.get(url)
response.raise_for_status()  # avoid writing an error page to disk as a PNG
with open('structure.png', 'wb') as f:
    f.write(response.content)
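The direct URL above follows the general PUG-REST layout of domain, namespace, identifier, optional operation, and output format. The following is a minimal sketch of a URL builder for that layout; `pug_rest_url` is a hypothetical helper, not a PubChemPy function, and it only assembles the string without sending a request.

```python
from urllib.parse import quote, urlencode

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(namespace, identifier, output, operation=None,
                 domain="compound", **options):
    """Assemble <base>/<domain>/<namespace>/<identifier>[/<operation>]/<output>."""
    # Percent-encode the identifier; commas stay literal so CID lists still work.
    parts = [BASE, domain, namespace, quote(str(identifier), safe=",")]
    if operation:
        parts.append(operation)
    parts.append(output)
    url = "/".join(parts)
    if options:
        url += "?" + urlencode(options)  # e.g. image_size=large
    return url


print(pug_rest_url("cid", 2244, "PNG", image_size="large"))
print(pug_rest_url("name", "acetylsalicylic acid", "JSON",
                   operation="property/MolecularWeight"))
```

Identifiers that contain slashes, such as some SMILES or InChI strings, are generally safer to send in a POST body than in the URL path, so this sketch is best suited to CIDs and names.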
7. Synonym Retrieval
Get all known names and synonyms for a compound:
import pubchempy as pcp
synonyms_data = pcp.get_synonyms('aspirin', 'name')
if synonyms_data:
    cid = synonyms_data[0]['CID']
    synonyms = synonyms_data[0]['Synonym']
    print(f"CID {cid} has {len(synonyms)} synonyms:")
    for syn in synonyms[:10]:  # First 10
        print(f"  - {syn}")
8. Bioactivity Data Access
Retrieve biological activity data from assays:
import requests
import json
# Get bioassay summary for a compound
cid = 2244 # Aspirin
url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{cid}/assaysummary/JSON"
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
    # Process bioassay information
    table = data.get('Table', {})
    rows = table.get('Row', [])
    print(f"Found {len(rows)} bioassay records")
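Beyond counting rows, the table can be turned into per-assay records. The sketch below assumes the response pairs a `Columns` -> `Column` list of header names with `Row` entries that each hold a `Cell` list, and that one header is named `Activity Outcome`; both are assumptions to check against a real response. The helper names are hypothetical and the mock payload is hand-made rather than real assay data.

```python
from collections import Counter


def rows_as_dicts(data):
    """Pair each row's cells with the column headers."""
    table = data.get("Table", {})
    columns = table.get("Columns", {}).get("Column", [])
    return [dict(zip(columns, row.get("Cell", [])))
            for row in table.get("Row", [])]


def outcome_counts(data, column="Activity Outcome"):
    """Tally rows by activity outcome (column name is an assumption)."""
    return Counter(rec.get(column, "Unknown") for rec in rows_as_dicts(data))


# Hand-made mock with the assumed layout, for illustration only.
mock = {
    "Table": {
        "Columns": {"Column": ["AID", "Activity Outcome"]},
        "Row": [
            {"Cell": ["1", "Active"]},
            {"Cell": ["2", "Inactive"]},
            {"Cell": ["3", "Inactive"]},
        ],
    }
}
print(outcome_counts(mock))
```

Filtering for active results then becomes a list comprehension over `rows_as_dicts(data)`.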
For more complex bioactivity queries, use the scripts/bioactivity_query.py helper script which provides:
- Bioassay summaries with activity outcome filtering
- Assay target identification
- Search for compounds by biological target
- Active compound lists for specific assays
9. Comprehensive Compound Annotations
Access detailed compound information through PUG-View:
import requests
cid = 2244
url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{cid}/JSON"
response = requests.get(url)
if response.status_code == 200:
    annotations = response.json()
    # Contains extensive data including:
    # - Chemical and Physical Properties
    # - Drug and Medication Information
    # - Pharmacology and Biochemistry
    # - Safety and Hazards
    # - Toxicity
    # - Literature references
    # - Patents
Get Specific Section:
# Get only drug information (spaces in the heading are URL-encoded as "+")
url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{cid}/JSON?heading=Drug+and+Medication+Information"
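A section can also be located client-side after fetching the full record. The sketch below assumes the PUG-View JSON nests sections as `Record` -> `Section` lists whose entries carry a `TOCHeading` and optionally their own `Section` list. `find_section` is a hypothetical helper, and the mock record is a hand-made skeleton, not real PubChem content.

```python
def find_section(node, heading):
    """Depth-first search for the first section whose TOCHeading matches."""
    for section in node.get("Section", []):
        if section.get("TOCHeading") == heading:
            return section
        found = find_section(section, heading)  # descend into subsections
        if found is not None:
            return found
    return None


# Hand-made skeleton of a record, for illustration only.
mock = {
    "Record": {
        "RecordNumber": 2244,
        "Section": [
            {"TOCHeading": "Names and Identifiers",
             "Section": [{"TOCHeading": "Computed Descriptors"}]},
            {"TOCHeading": "Safety and Hazards"},
        ],
    }
}
section = find_section(mock["Record"], "Computed Descriptors")
print(section["TOCHeading"] if section else "not found")
```

With a live response, the call would be `find_section(annotations["Record"], "Safety and Hazards")`, and the result should be checked for `None` before use.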
Installation Requirements
Install PubChemPy for Python-based access:
uv pip install pubchempy
For direct API access and bioactivity queries:
uv pip install requests
Optional for data analysis:
uv pip install pandas
Helper Scripts
This skill includes Python scripts for common PubChem tasks:
scripts/compound_search.py
Provides utility functions for searching and retrieving compound information:
Key Functions:
- search_by_name(name, max_results=10): Search compounds by name
- search_by_smiles(smiles): Search by SMILES string
- get_compound_by_cid(cid): Retrieve compound by CID
- get_compound_properties(identifier, namespace, properties): Get specific properties
- similarity_search(smiles, threshold, max_records): Perform similarity search
- substructure_search(smiles, max_records): Perform substructure search
- get_synonyms(identifier, namespace): Get all synonyms
- batch_search(identifiers, namespace, properties): Batch search multiple compounds
- download_structure(identifier, namespace, format, filename): Download structures
- print_compound_info(compound): Print formatted compound information
Usage:
from scripts.compound_search import search_by_name, get_compound_properties
# Search for a compound
compounds = search_by_name('ibuprofen')
# Get specific properties
props = get_compound_properties('aspirin', 'name', ['MolecularWeight', 'XLogP'])
scripts/bioactivity_query.py
Provides functions for retrieving biological activity data:
Key Functions:
- get_bioassay_summary(cid): Get bioassay summary for compound
- get_compound_bioactivities(cid, activity_outcome): Get filtered bioactivities
- get_assay_description(aid): Get detailed assay information
- get_assay_targets(aid): Get biological targets for assay
- search_assays_by_target(target_name, max_results): Find assays by target
- get_active_compounds_in_assay(aid, max_results): Get active compounds
- get_compound_annotations(cid, section): Get PUG-View annotations
- summarize_bioactivities(cid): Generate bioactivity summary statistics
- find_compounds_by_bioactivity(target, threshold, max_compounds): Find compounds by target
Usage:
from scripts.bioactivity_query import get_bioassay_summary, summarize_bioactivities
# Get bioactivity summary
summary = summarize_bioactivities(2244) # Aspirin
print(f"Total assays: {summary['total_assays']}")
print(f"Active: {summary['active']}, Inactive: {summary['inactive']}")
API Rate Limits and Best Practices
Rate Limits:
- Maximum 5 requests per second
- Maximum 400 requests per minute
- Maximum 300 seconds running time per minute
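One way to stay under the per-second limit is to space requests on the client side. The sketch below is a minimal single-threaded throttle, not part of PubChemPy or the skill's scripts; `Throttle` is a hypothetical class, and its clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum gap between calls (0.2 s for 5 requests/second)."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


throttle = Throttle(max_per_second=5)
for cid in (2244, 3672, 1983):
    throttle.wait()  # call this immediately before each request
    print(f"would request CID {cid}")
```

Spacing calls 0.2 seconds apart caps throughput at 300 requests per minute, which also stays below the 400-per-minute limit. The running-time limit depends on how expensive each query is and is not addressed by this sketch.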
**Best Pr
Content truncated.