company-product-context


Compiles comprehensive company product context from PDF documents, web research, and industry knowledge

Install

mkdir -p .claude/skills/company-product-context && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5643" && unzip -o skill.zip -d .claude/skills/company-product-context && rm skill.zip

Installs to .claude/skills/company-product-context

About this skill

Company Product Context Compiler

This skill extracts information from company PDF documents, conducts web research, and synthesizes industry knowledge to create a comprehensive company product context report.

Copy this checklist and track your progress:

Company Product Context Progress:
- [ ] Step 1: Gather company materials and identify sources
- [ ] Step 2: Extract information from PDF documents
- [ ] Step 3: Structure extracted data
- [ ] Step 4: Conduct web research and validation
- [ ] Step 5: Synthesize industry knowledge
- [ ] Step 6: Compile comprehensive product context
- [ ] Step 7: Generate final report
- [ ] Step 8: Export deliverables

Step 1: Gather company materials and identify sources

Collect all available company information:

Required Inputs:

  • Company PDF documents (annual reports, product sheets, presentations, etc.)
  • Company name and website URL
  • Industry/sector information
  • Specific products or services to focus on (if applicable)

Actions:

  1. Request all relevant PDF files from user
  2. Confirm company name, website, and primary industry
  3. Ask about specific focus areas or products of interest
  4. Identify any competitive context needed

Expected in INPUT_DIR:

  • *.pdf - Company documents
  • company_info.txt - Basic company details (optional)
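Before moving to extraction, the inputs above can be verified with a short pre-flight check. This is a sketch, not part of the skill's scripts; it assumes the same `INPUT_DIR` environment variable (defaulting to `/tmp`) that the Step 2 extraction script uses.

```python
import os
from pathlib import Path

# Pre-flight check: confirm the expected inputs are present before
# starting extraction. INPUT_DIR matches the Step 2 script's default.
input_dir = Path(os.environ.get('INPUT_DIR', '/tmp'))

pdfs = sorted(input_dir.glob('*.pdf'))
info = input_dir / 'company_info.txt'

print(f"PDF documents found: {len(pdfs)}")
for pdf in pdfs:
    print(f"  - {pdf.name}")
print(f"company_info.txt present: {info.exists()}")

if not pdfs:
    print("No PDFs found - request company documents before continuing.")
```

If no PDFs are present, return to the user for materials rather than proceeding with web research alone.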

Step 2: Extract information from PDF documents

Extract structured information from all provided PDF files.

Use the Python script for PDF extraction:

import os
import re
import json
from pathlib import Path

# Note: PyPDF2 is no longer maintained; its successor is the `pypdf`
# package, which keeps a compatible PdfReader API.
import PyPDF2

def extract_pdf_content(pdf_path):
    """Extract text content from PDF file."""
    text_content = []
    metadata = {}
    
    try:
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            
            # Extract metadata
            if pdf_reader.metadata:
                metadata = {
                    'title': pdf_reader.metadata.get('/Title', ''),
                    'author': pdf_reader.metadata.get('/Author', ''),
                    'subject': pdf_reader.metadata.get('/Subject', ''),
                    'pages': len(pdf_reader.pages)
                }
            else:
                metadata = {'pages': len(pdf_reader.pages)}
            
            # Extract text from all pages
            for page_num, page in enumerate(pdf_reader.pages, 1):
                try:
                    text = page.extract_text()
                    if text.strip():
                        text_content.append({
                            'page': page_num,
                            'text': text
                        })
                except Exception as e:
                    print(f"Error extracting page {page_num}: {e}")
                    
    except Exception as e:
        print(f"Error reading PDF {pdf_path}: {e}")
        return None
    
    return {
        'filename': os.path.basename(pdf_path),
        'metadata': metadata,
        'content': text_content
    }

def extract_key_sections(text):
    """Extract key sections from text based on common headers."""
    sections = {
        'company_overview': [],
        'products_services': [],
        'business_model': [],
        'market_position': [],
        'financials': [],
        'technology': [],
        'customers': [],
        'strategy': [],
        'other': []
    }
    
    # Keywords for section identification
    keywords = {
        'company_overview': ['about us', 'company overview', 'who we are', 'introduction', 'history'],
        'products_services': ['products', 'services', 'solutions', 'offerings', 'portfolio'],
        'business_model': ['business model', 'revenue model', 'how we work', 'operations'],
        'market_position': ['market', 'industry', 'competitive', 'position', 'landscape'],
        'financials': ['financial', 'revenue', 'earnings', 'profit', 'growth'],
        'technology': ['technology', 'platform', 'infrastructure', 'technical', 'innovation'],
        'customers': ['customers', 'clients', 'partners', 'case study', 'testimonial'],
        'strategy': ['strategy', 'vision', 'mission', 'goals', 'objectives', 'roadmap']
    }
    
    lines = text.split('\n')
    current_section = 'other'
    
    for line in lines:
        line_lower = line.lower().strip()
        
        # Check if line is a section header
        for section, section_keywords in keywords.items():
            if any(keyword in line_lower for keyword in section_keywords):
                if len(line_lower) < 100:  # Likely a header
                    current_section = section
                    break
        
        if line.strip():
            sections[current_section].append(line)
    
    return sections

def analyze_company_info(extracted_data):
    """Analyze extracted data for key company information."""
    analysis = {
        'company_name': '',
        'industry': '',
        'products': [],
        'key_terms': [],
        'metrics': [],
        'urls': [],
        'emails': []
    }
    
    all_text = ''
    for doc in extracted_data:
        for page in doc['content']:
            all_text += page['text'] + '\n'
    
    # Extract URLs
    url_pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
    analysis['urls'] = list(set(re.findall(url_pattern, all_text)))
    
    # Extract emails
    email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    analysis['emails'] = list(set(re.findall(email_pattern, all_text)))
    
    # Extract potential metrics (numbers with units/context)
    metrics_pattern = r'\$?\d+\.?\d*\s*(?:million|billion|trillion|k|M|B|%|percent|users|customers|employees)'
    analysis['metrics'] = re.findall(metrics_pattern, all_text, re.IGNORECASE)
    
    return analysis

def main():
    input_dir = os.environ.get('INPUT_DIR', '/tmp')
    output_dir = '/tmp/extracted_data'
    os.makedirs(output_dir, exist_ok=True)
    
    # Find all PDF files
    pdf_files = list(Path(input_dir).glob('*.pdf'))
    
    if not pdf_files:
        print("No PDF files found in input directory")
        return
    
    print(f"Found {len(pdf_files)} PDF file(s)")
    
    extracted_data = []
    
    for pdf_file in pdf_files:
        print(f"\nProcessing: {pdf_file.name}")
        data = extract_pdf_content(str(pdf_file))
        
        if data:
            extracted_data.append(data)
            
            # Extract sections from content
            all_text = '\n'.join([page['text'] for page in data['content']])
            sections = extract_key_sections(all_text)
            
            # Save individual file data
            output_file = output_dir + f"/{pdf_file.stem}_extracted.json"
            with open(output_file, 'w', encoding='utf-8') as f:
                json.dump({
                    'metadata': data['metadata'],
                    'sections': {k: '\n'.join(v) for k, v in sections.items() if v},
                    'full_text': all_text
                }, f, indent=2, ensure_ascii=False)
            
            print(f"✓ Extracted {len(data['content'])} pages")
            print(f"✓ Saved to: {output_file}")
    
    # Analyze all extracted data
    if extracted_data:
        analysis = analyze_company_info(extracted_data)
        
        analysis_file = output_dir + '/company_analysis.json'
        with open(analysis_file, 'w', encoding='utf-8') as f:
            json.dump(analysis, f, indent=2, ensure_ascii=False)
        
        print(f"\n✓ Company analysis saved to: {analysis_file}")
        print(f"✓ Found {len(analysis['urls'])} URLs")
        print(f"✓ Found {len(analysis['emails'])} email addresses")
        print(f"✓ Found {len(analysis['metrics'])} metrics")
    
    print(f"\n✓ Extraction complete. All data saved to: {output_dir}")

if __name__ == '__main__':
    main()

Execute the extraction:

python3 /tmp/company-product-context/extract_pdfs.py

Outputs:

  • /tmp/extracted_data/[filename]_extracted.json - Structured data per PDF
  • /tmp/extracted_data/company_analysis.json - Aggregated analysis
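A quick sanity check on the aggregated output can catch extraction problems before Step 3. This sketch assumes the output paths produced by the extraction script above.

```python
import json
from pathlib import Path

# Sanity-check the aggregated analysis produced by the Step 2 script.
analysis_path = Path('/tmp/extracted_data/company_analysis.json')

if analysis_path.exists():
    analysis = json.loads(analysis_path.read_text(encoding='utf-8'))
    for key in ('urls', 'emails', 'metrics'):
        print(f"{key}: {len(analysis.get(key, []))} found")
else:
    print("company_analysis.json not found - run the extraction script first.")
```

Empty `urls`, `emails`, and `metrics` lists across several PDFs usually indicate scanned (image-only) documents, which need OCR before this skill can extract text.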

Step 3: Structure extracted data

Organize the extracted information into a structured format.

Review extracted data:

# List all extracted files
ls -la /tmp/extracted_data/

# Review company analysis
jq '.' /tmp/extracted_data/company_analysis.json

# Review individual extractions (metadata, then detected section names)
for file in /tmp/extracted_data/*_extracted.json; do
    echo "=== $(basename "$file") ==="
    jq '.metadata, (.sections | keys)' "$file"
done

Manually review and note:

  • Company name and full legal name
  • Core products and services
  • Business model and revenue streams
  • Target customers and market segments
  • Key differentiators
  • Technology stack or platform details
  • Financial highlights
  • Strategic initiatives
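To support the manual review, a small sketch (assuming the Step 2 output layout) can report which sections were detected in each document, so gaps such as no financials coverage stand out at a glance.

```python
import json
from pathlib import Path

# Summarize section coverage across all extracted documents.
extracted_dir = Path('/tmp/extracted_data')
section_coverage = {}

for path in sorted(extracted_dir.glob('*_extracted.json')):
    data = json.loads(path.read_text(encoding='utf-8'))
    sections = sorted(data.get('sections', {}))
    section_coverage[path.stem] = sections
    print(f"{path.name}: {', '.join(sections) or '(no sections detected)'}")

# Sections missing from every document are good candidates for the
# web-research step that follows. The set mirrors extract_key_sections().
expected = {'company_overview', 'products_services', 'business_model',
            'market_position', 'financials', 'technology',
            'customers', 'strategy'}
all_found = {s for secs in section_coverage.values() for s in secs}
missing = expected - all_found
if missing:
    print(f"Not covered by any PDF: {', '.join(sorted(missing))}")
```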

Step 4: Conduct web research and validation

Note: This step requires web search capabilities. Use the information extracted in Step 2 to guide the research areas below.

Research focus areas:

  1. Company verification: Confirm company details, recent news, press releases
  2. Product information: Latest product updates, feature sets, pricing
  3. Market position: Industry reports, analyst coverage, competitive landscape
  4. Customer base: Case studies, testimonials, major clients
  5. Technology: Tech stack, integrations, API documentation
  6. Recent developments: Funding rounds, partnerships, acquisitions

Search queries to execute:

  • "[Company Name] official website"
  • "[Company Name] products and services"
  • "[Company Name] company overview"
  • "[Company Name] industry analysis"
  • "[Company Name] competitors"
  • "[Company Name] case studies"
  • "[Company Name] recent news"
  • "[Company Name] technology stack"
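The query templates above can be expanded programmatically once the company name is confirmed. This helper is hypothetical (not part of the skill's scripts); the example company name is a placeholder.

```python
# Expand the Step 4 query templates for a given company name.
QUERY_TEMPLATES = [
    '{name} official website',
    '{name} products and services',
    '{name} company overview',
    '{name} industry analysis',
    '{name} competitors',
    '{name} case studies',
    '{name} recent news',
    '{name} technology stack',
]

def build_queries(company_name):
    """Return the Step 4 search queries for one company."""
    return [t.format(name=company_name) for t in QUERY_TEMPLATES]

for query in build_queries('Acme Corp'):  # placeholder company name
    print(query)
```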

Document findings in:

# Create research notes file
cat > /tmp/extracted_data/web_research.md << 'EOF'
# Web Research Findings

## Official Sources
- Website: [URL]
- LinkedIn: [URL]
- Documentation: [U

---

*Content truncated.*
