company-product-context
Compiles comprehensive company product context from PDF documents, web research, and industry knowledge
Install
mkdir -p .claude/skills/company-product-context && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5643" && unzip -o skill.zip -d .claude/skills/company-product-context && rm skill.zip
Installs to .claude/skills/company-product-context
About this skill
Company Product Context Compiler
This skill extracts information from company PDF documents, conducts web research, and synthesizes industry knowledge to create a comprehensive company product context report.
Copy this checklist and track your progress:
Company Product Context Progress:
- [ ] Step 1: Gather company materials and identify sources
- [ ] Step 2: Extract information from PDF documents
- [ ] Step 3: Structure extracted data
- [ ] Step 4: Conduct web research and validation
- [ ] Step 5: Synthesize industry knowledge
- [ ] Step 6: Compile comprehensive product context
- [ ] Step 7: Generate final report
- [ ] Step 8: Export deliverables
Step 1: Gather company materials and identify sources
Collect all available company information:
Required Inputs:
- Company PDF documents (annual reports, product sheets, presentations, etc.)
- Company name and website URL
- Industry/sector information
- Specific products or services to focus on (if applicable)
Actions:
- Request all relevant PDF files from user
- Confirm company name, website, and primary industry
- Ask about specific focus areas or products of interest
- Identify any competitive context needed
Expected in INPUT_DIR:
- *.pdf - Company documents
- company_info.txt - Basic company details (optional)
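Before moving on, the expected inputs can be verified programmatically. A minimal pre-flight sketch, assuming the INPUT_DIR layout described above (the function name `check_inputs` is hypothetical, not part of the skill's scripts):

```python
import os
from pathlib import Path

def check_inputs(input_dir="/tmp"):
    """Report whether the expected Step 1 inputs are present.

    Assumes the INPUT_DIR convention used by this skill:
    PDFs are required, company_info.txt is optional.
    """
    root = Path(input_dir)
    pdfs = sorted(root.glob("*.pdf"))
    report = {
        "pdf_count": len(pdfs),
        "pdf_files": [p.name for p in pdfs],
        "has_company_info": (root / "company_info.txt").exists(),
    }
    if not pdfs:
        # Mirror the later extraction script, which exits when no PDFs exist
        report["warning"] = "No PDF files found - request documents from the user"
    return report
```

Running this before Step 2 avoids invoking the extraction script against an empty directory.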
Step 2: Extract information from PDF documents
Extract structured information from all provided PDF files.
Use the Python script for PDF extraction:
import os
import re
from pathlib import Path
import PyPDF2
import json


def extract_pdf_content(pdf_path):
    """Extract text content from PDF file."""
    text_content = []
    metadata = {}
    try:
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            # Extract metadata
            if pdf_reader.metadata:
                metadata = {
                    'title': pdf_reader.metadata.get('/Title', ''),
                    'author': pdf_reader.metadata.get('/Author', ''),
                    'subject': pdf_reader.metadata.get('/Subject', ''),
                    'pages': len(pdf_reader.pages)
                }
            else:
                metadata = {'pages': len(pdf_reader.pages)}
            # Extract text from all pages
            for page_num, page in enumerate(pdf_reader.pages, 1):
                try:
                    text = page.extract_text()
                    if text.strip():
                        text_content.append({
                            'page': page_num,
                            'text': text
                        })
                except Exception as e:
                    print(f"Error extracting page {page_num}: {e}")
    except Exception as e:
        print(f"Error reading PDF {pdf_path}: {e}")
        return None
    return {
        'filename': os.path.basename(pdf_path),
        'metadata': metadata,
        'content': text_content
    }


def extract_key_sections(text):
    """Extract key sections from text based on common headers."""
    sections = {
        'company_overview': [],
        'products_services': [],
        'business_model': [],
        'market_position': [],
        'financials': [],
        'technology': [],
        'customers': [],
        'strategy': [],
        'other': []
    }
    # Keywords for section identification
    keywords = {
        'company_overview': ['about us', 'company overview', 'who we are', 'introduction', 'history'],
        'products_services': ['products', 'services', 'solutions', 'offerings', 'portfolio'],
        'business_model': ['business model', 'revenue model', 'how we work', 'operations'],
        'market_position': ['market', 'industry', 'competitive', 'position', 'landscape'],
        'financials': ['financial', 'revenue', 'earnings', 'profit', 'growth'],
        'technology': ['technology', 'platform', 'infrastructure', 'technical', 'innovation'],
        'customers': ['customers', 'clients', 'partners', 'case study', 'testimonial'],
        'strategy': ['strategy', 'vision', 'mission', 'goals', 'objectives', 'roadmap']
    }
    lines = text.split('\n')
    current_section = 'other'
    for line in lines:
        line_lower = line.lower().strip()
        # Check if line is a section header
        for section, section_keywords in keywords.items():
            if any(keyword in line_lower for keyword in section_keywords):
                if len(line_lower) < 100:  # Likely a header
                    current_section = section
                break
        if line.strip():
            sections[current_section].append(line)
    return sections


def analyze_company_info(extracted_data):
    """Analyze extracted data for key company information."""
    analysis = {
        'company_name': '',
        'industry': '',
        'products': [],
        'key_terms': [],
        'metrics': [],
        'urls': [],
        'emails': []
    }
    all_text = ''
    for doc in extracted_data:
        for page in doc['content']:
            all_text += page['text'] + '\n'
    # Extract URLs
    url_pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
    analysis['urls'] = list(set(re.findall(url_pattern, all_text)))
    # Extract emails
    email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'
    analysis['emails'] = list(set(re.findall(email_pattern, all_text)))
    # Extract potential metrics (numbers with units/context)
    metrics_pattern = r'\$?\d+\.?\d*\s*(?:million|billion|trillion|k|M|B|%|percent|users|customers|employees)'
    analysis['metrics'] = re.findall(metrics_pattern, all_text, re.IGNORECASE)
    return analysis


def main():
    input_dir = os.environ.get('INPUT_DIR', '/tmp')
    output_dir = '/tmp/extracted_data'
    os.makedirs(output_dir, exist_ok=True)
    # Find all PDF files
    pdf_files = list(Path(input_dir).glob('*.pdf'))
    if not pdf_files:
        print("No PDF files found in input directory")
        return
    print(f"Found {len(pdf_files)} PDF file(s)")
    extracted_data = []
    for pdf_file in pdf_files:
        print(f"\nProcessing: {pdf_file.name}")
        data = extract_pdf_content(str(pdf_file))
        if data:
            extracted_data.append(data)
            # Extract sections from content
            all_text = '\n'.join([page['text'] for page in data['content']])
            sections = extract_key_sections(all_text)
            # Save individual file data
            output_file = output_dir + f"/{pdf_file.stem}_extracted.json"
            with open(output_file, 'w', encoding='utf-8') as f:
                json.dump({
                    'metadata': data['metadata'],
                    'sections': {k: '\n'.join(v) for k, v in sections.items() if v},
                    'full_text': all_text
                }, f, indent=2, ensure_ascii=False)
            print(f"✓ Extracted {len(data['content'])} pages")
            print(f"✓ Saved to: {output_file}")
    # Analyze all extracted data
    if extracted_data:
        analysis = analyze_company_info(extracted_data)
        analysis_file = output_dir + '/company_analysis.json'
        with open(analysis_file, 'w', encoding='utf-8') as f:
            json.dump(analysis, f, indent=2, ensure_ascii=False)
        print(f"\n✓ Company analysis saved to: {analysis_file}")
        print(f"✓ Found {len(analysis['urls'])} URLs")
        print(f"✓ Found {len(analysis['emails'])} email addresses")
        print(f"✓ Found {len(analysis['metrics'])} metrics")
    print(f"\n✓ Extraction complete. All data saved to: {output_dir}")


if __name__ == '__main__':
    main()
Execute the extraction:
python3 /tmp/company-product-context/extract_pdfs.py
Outputs:
- /tmp/extracted_data/[filename]_extracted.json - Structured data per PDF
- /tmp/extracted_data/company_analysis.json - Aggregated analysis
Step 3: Structure extracted data
Organize the extracted information into a structured format.
Review extracted data:
# List all extracted files
ls -la /tmp/extracted_data/
# Review company analysis
cat /tmp/extracted_data/company_analysis.json | jq '.'
# Review individual extractions
for file in /tmp/extracted_data/*_extracted.json; do
echo "=== $(basename $file) ==="
cat "$file" | jq '.metadata, .sections | keys'
done
Manually review and note:
- Company name and full legal name
- Core products and services
- Business model and revenue streams
- Target customers and market segments
- Key differentiators
- Technology stack or platform details
- Financial highlights
- Strategic initiatives
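To make that manual review faster, the per-file JSON outputs from Step 2 can be condensed into one-line summaries. A minimal sketch, assuming the `*_extracted.json` layout the extraction script writes (the helper name `summarize_extractions` is hypothetical):

```python
import json
from pathlib import Path

def summarize_extractions(data_dir="/tmp/extracted_data"):
    """Return one summary dict per extracted PDF to guide manual review.

    Assumes the metadata/sections/full_text layout written by the
    Step 2 extraction script.
    """
    summaries = []
    for path in sorted(Path(data_dir).glob("*_extracted.json")):
        doc = json.loads(path.read_text(encoding="utf-8"))
        sections = doc.get("sections", {})
        summaries.append({
            "file": path.name,
            "pages": doc.get("metadata", {}).get("pages"),
            # Non-empty sections are the first places to look for answers
            "sections_found": sorted(sections),
            "chars": len(doc.get("full_text", "")),
        })
    return summaries
```

Files with few sections found or a low character count likely need the PDF re-checked by hand (e.g. scanned documents with no text layer).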
Step 4: Conduct web research and validation
Note: This step requires web search capabilities. Use the extracted information from Step 2 to guide the research below.
Research focus areas:
- Company verification: Confirm company details, recent news, press releases
- Product information: Latest product updates, feature sets, pricing
- Market position: Industry reports, analyst coverage, competitive landscape
- Customer base: Case studies, testimonials, major clients
- Technology: Tech stack, integrations, API documentation
- Recent developments: Funding rounds, partnerships, acquisitions
Search queries to execute:
- "[Company Name] official website"
- "[Company Name] products and services"
- "[Company Name] company overview"
- "[Company Name] industry analysis"
- "[Company Name] competitors"
- "[Company Name] case studies"
- "[Company Name] recent news"
- "[Company Name] technology stack"
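The query list above can also be generated programmatically, seeded with any product names captured in Step 2's analysis file. A minimal sketch, assuming that file's layout (the helper name `build_queries` and the product-review query template are assumptions, not part of the skill's scripts):

```python
import json
from pathlib import Path

# Mirrors the search queries listed above
QUERY_TEMPLATES = [
    "{name} official website",
    "{name} products and services",
    "{name} company overview",
    "{name} industry analysis",
    "{name} competitors",
    "{name} case studies",
    "{name} recent news",
    "{name} technology stack",
]

def build_queries(company_name,
                  analysis_path="/tmp/extracted_data/company_analysis.json"):
    """Expand the query templates, plus one extra query per product
    found in the Step 2 analysis file (which is optional here)."""
    queries = [t.format(name=company_name) for t in QUERY_TEMPLATES]
    path = Path(analysis_path)
    if path.exists():
        analysis = json.loads(path.read_text(encoding="utf-8"))
        for product in analysis.get("products", []):
            queries.append(f"{company_name} {product} review")
    return queries
```

Feed the returned list into whatever web search tool is available, one query at a time, and record findings in the research notes file below.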
Document findings in:
# Create research notes file
cat > /tmp/extracted_data/web_research.md << 'EOF'
# Web Research Findings
## Official Sources
- Website: [URL]
- LinkedIn: [URL]
- Documentation: [U
---
*Content truncated.*