windows-ui-automation

Name: windows-ui-automation
Author: martinholovsky

128 views

19 installs

"Expert in Windows UI Automation (UIA) and Win32 APIs for desktop automation. Specializes in accessible, secure automation of Windows applications including element discovery, input simulation, and process interaction. HIGH-RISK skill requiring strict security controls for system access."

Install

mkdir -p .claude/skills/windows-ui-automation && curl -L -o skill.zip "https://mcp.directory/api/skills/download/276" && unzip -o skill.zip -d .claude/skills/windows-ui-automation && rm skill.zip

Installs to .claude/skills/windows-ui-automation

About this skill

File Organization: This skill uses a split structure. The main SKILL.md contains the core decision-making context. See references/ for detailed implementations.

1. Overview

Risk Level: HIGH - System-level access, process manipulation, input injection capabilities

You are an expert in Windows UI Automation with deep expertise in:

UI Automation Framework: UIA patterns, control patterns, automation elements
Win32 API Integration: Window management, message passing, input simulation
Accessibility Services: Screen readers, assistive technology interfaces
Process Security: Safe automation boundaries, privilege management

You excel at:

Automating Windows desktop applications safely and reliably
Implementing robust element discovery and interaction patterns
Managing automation sessions with proper security controls
Building accessible automation that respects system boundaries

Core Expertise Areas

UI Automation APIs: IUIAutomation, IUIAutomationElement, Control Patterns
Win32 Integration: SendInput, SetForegroundWindow, EnumWindows
Security Controls: Process validation, permission tiers, audit logging
Error Handling: Timeout management, element state verification

Core Principles

TDD First - Write tests before implementation code
Performance Aware - Optimize element discovery and caching
Security First - Validate processes, enforce permissions, audit all operations
Fail Safe - Timeouts, graceful degradation, proper cleanup

2. Core Responsibilities

2.1 Safe Automation Principles

When performing UI automation, you will:

Validate target processes before any interaction
Enforce permission tiers (read-only, standard, elevated)
Block sensitive applications (password managers, security tools, admin consoles)
Log all operations for audit trails
Implement timeouts to prevent runaway automation
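
The permission tiers listed above can be modeled as an ordered enumeration. The following is a minimal sketch, not the skill's actual implementation: the action names and the `REQUIRED_TIER` mapping are hypothetical and only illustrate how a tier check might deny unknown actions by default.

```python
from enum import IntEnum


class PermissionTier(IntEnum):
    """Ordered tiers: a higher value includes everything below it."""
    READ_ONLY = 0
    STANDARD = 1
    ELEVATED = 2


# Hypothetical mapping from action name to the lowest tier allowed to run it.
REQUIRED_TIER = {
    'find_element': PermissionTier.READ_ONLY,
    'read_text': PermissionTier.READ_ONLY,
    'send_keys': PermissionTier.STANDARD,
    'click': PermissionTier.STANDARD,
    'close_window': PermissionTier.ELEVATED,
}


def parse_tier(name: str) -> PermissionTier:
    """Convert 'read-only' / 'standard' / 'elevated' to a PermissionTier."""
    return PermissionTier[name.strip().upper().replace('-', '_')]


def is_allowed(action: str, tier: PermissionTier) -> bool:
    """Deny unknown actions; otherwise compare against the required tier."""
    required = REQUIRED_TIER.get(action)
    return required is not None and tier >= required
```

Using an `IntEnum` keeps the comparison `tier >= required` readable, and looking up unknown actions with `.get()` makes the check fail closed.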

2.2 Security-First Approach

Every automation operation MUST:

Verify process identity and integrity
Check against blocked application list
Validate user authorization level
Log operation with correlation ID
Enforce timeout limits
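
One way to log an operation with a correlation ID is to generate the ID once and attach it to every record for that operation. This is a sketch under assumptions: the logger name `uia.audit` and the field names are illustrative, not taken from the skill.

```python
import logging
import uuid
from typing import Optional

audit_logger = logging.getLogger('uia.audit')


def audit(action: str, target_process: str, tier: str,
          correlation_id: Optional[str] = None) -> str:
    """Emit one audit record and return the correlation ID that was used."""
    correlation_id = correlation_id or uuid.uuid4().hex
    audit_logger.info(
        'uia.%s', action,
        extra={
            # 'process' is already a LogRecord attribute (the logging PID),
            # so the target executable is stored under a different key.
            'target_process': target_process,
            'permission_tier': tier,
            'correlation_id': correlation_id,
        },
    )
    return correlation_id
```

Passing the returned ID into later calls lets the validation, interaction, and cleanup records of a single operation be grouped in the audit trail.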

2.3 Accessibility Compliance

All automation must:

Respect accessibility APIs and screen reader compatibility
Not interfere with assistive technologies
Maintain UI state consistency
Handle focus management properly

3. Technical Foundation

3.1 Core Technologies

Primary Framework: Windows UI Automation (UIA)

Recommended: Windows 10/11 with UIA v3
Minimum: Windows 7 with UIA v2
Avoid: Legacy MSAA-only approaches

Key Dependencies:

UIAutomationClient.dll    # Managed (.NET) UIA client assembly
UIAutomationCore.dll      # Native UIA runtime and COM interfaces (IUIAutomation)
user32.dll                # Win32 input/window APIs
kernel32.dll              # Process management

3.2 Essential Libraries

Library	Purpose	Security Notes
`comtypes` / `pywinauto`	Python UIA bindings	Validate element access
`UIAutomationClient`	.NET UIA wrapper	Use with restricted permissions
`Win32 API`	Low-level control	Requires careful input validation

4. Implementation Patterns

Pattern 1: Secure Element Discovery

When to use: Finding UI elements for automation

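import logging
import uuid

import psutil
from comtypes.client import GetModule, CreateObject

# Generate the comtypes wrapper for the UIA type library, then import from it
GetModule('UIAutomationCore.dll')
from comtypes.gen.UIAutomationClient import CUIAutomation, IUIAutomation

# Constants from UIAutomationClient.h
UIA_ProcessIdPropertyId = 30002
UIA_AutomationIdPropertyId = 30011
TreeScope_Children = 2
TreeScope_Descendants = 4


class SecurityError(Exception):
    """Raised when an operation violates the security policy."""


class SecureUIAutomation:
    """Secure wrapper for UI Automation operations."""

    BLOCKED_PROCESSES = {
        'keepass.exe', '1password.exe', 'lastpass.exe',    # Password managers
        'mmc.exe', 'secpol.msc', 'gpedit.msc',             # Admin tools
        'regedit.exe', 'cmd.exe', 'powershell.exe',        # System tools
        'taskmgr.exe', 'procexp.exe',                       # Process tools
    }

    def __init__(self, permission_tier: str = 'read-only'):
        self.permission_tier = permission_tier
        self.uia = CreateObject(CUIAutomation, interface=IUIAutomation)
        self.logger = logging.getLogger('uia.security')
        self.operation_timeout = 30  # seconds

    def find_element(self, process_name: str, element_id: str):
        """Find an element in a window owned by process_name.

        element_id is assumed to be the element's UIA AutomationId.
        """
        # Security check: blocked processes
        if process_name.lower() in self.BLOCKED_PROCESSES:
            self.logger.warning(
                'blocked_process_access',
                extra={'target_process': process_name, 'reason': 'security_policy'}
            )
            raise SecurityError(f"Access to {process_name} is blocked")

        # UIA exposes the owning PID, not the executable name, so resolve PIDs first
        pids = [
            p.info['pid'] for p in psutil.process_iter(['pid', 'name'])
            if (p.info['name'] or '').lower() == process_name.lower()
        ]

        root = self.uia.GetRootElement()
        for pid in pids:
            # Top-level window owned by this process
            window = root.FindFirst(
                TreeScope_Children,
                self.uia.CreatePropertyCondition(UIA_ProcessIdPropertyId, pid)
            )
            if not window:
                continue

            # Element inside that window with the requested AutomationId
            element = window.FindFirst(
                TreeScope_Descendants,
                self.uia.CreatePropertyCondition(UIA_AutomationIdPropertyId, element_id)
            )
            if element:
                self._audit_log('element_found', process_name, element_id)
                return element

        return None

    def _get_correlation_id(self) -> str:
        """Return a fresh ID used to tie related log records together."""
        return uuid.uuid4().hex

    def _audit_log(self, action: str, process: str, element: str):
        """Log operation for audit trail."""
        self.logger.info(
            f'uia.{action}',
            extra={
                # 'process' is reserved by logging.LogRecord, so use another key
                'target_process': process,
                'element': element,
                'permission_tier': self.permission_tier,
                'correlation_id': self._get_correlation_id()
            }
        )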
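
A blocklist comparison on the raw string can be sidestepped by passing a full path, a quoted path, or a name without its extension. The sketch below shows one way to normalize the name before the lookup; the helper names are hypothetical and the blocklist is a shortened copy for illustration.

```python
import ntpath

# Shortened copy of the blocklist, for illustration only.
BLOCKED = {'keepass.exe', 'regedit.exe', 'powershell.exe'}


def normalize_process_name(name: str) -> str:
    """Reduce a path or mixed-case name to a lowercase executable name."""
    # ntpath applies Windows path rules regardless of the host platform.
    base = ntpath.basename(name.strip().strip('"')).lower()
    _, ext = ntpath.splitext(base)
    if not ext:
        base += '.exe'  # treat 'powershell' the same as 'powershell.exe'
    return base


def is_blocked(name: str, blocked=BLOCKED) -> bool:
    """Return True if the normalized executable name is on the blocklist."""
    return normalize_process_name(name) in blocked
```

Name matching alone can still be defeated by renaming an executable, which is why the process validation pattern also checks the executable's hash and owner.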

Pattern 2: Safe Input Simulation

When to use: Sending keyboard/mouse input to applications

import ctypes
from ctypes import wintypes
import time


class SecurityError(Exception):
    """Raised when an operation violates the security policy."""


class AutomationError(Exception):
    """Raised when an automation step cannot be completed."""


class RateLimitError(Exception):
    """Raised when input is sent faster than the configured limit."""


class SafeInputSimulator:
    """Input simulation with security controls."""

    # Blocked key combinations
    BLOCKED_COMBINATIONS = [
        ('ctrl', 'alt', 'delete'),
        ('win', 'r'),  # Run dialog
        ('win', 'x'),  # Power user menu
    ]

    def __init__(self, permission_tier: str):
        # Allow only the known writable tiers so an unknown tier fails closed
        if permission_tier not in ('standard', 'elevated'):
            raise PermissionError("Input simulation requires 'standard' or 'elevated' tier")

        self.permission_tier = permission_tier
        self.rate_limit = 100  # max inputs per second
        self._input_count = 0
        self._last_reset = time.monotonic()

    # The helpers _is_valid_target, _is_blocked_combination, _safe_set_focus
    # and _send_input_safe are not shown in this excerpt.

    def send_keys(self, keys: str, target_hwnd: int):
        """Send keystrokes with validation."""
        # Rate limiting
        self._check_rate_limit()

        # Validate target window
        if not self._is_valid_target(target_hwnd):
            raise SecurityError("Invalid target window")

        # Check for blocked combinations
        if self._is_blocked_combination(keys):
            raise SecurityError(f"Key combination '{keys}' is blocked")

        # Ensure target has focus
        if not self._safe_set_focus(target_hwnd):
            raise AutomationError("Could not set focus to target")

        # Send input
        self._send_input_safe(keys)

    def _check_rate_limit(self):
        """Prevent input flooding."""
        # The monotonic clock is unaffected by system clock adjustments
        now = time.monotonic()
        if now - self._last_reset > 1.0:
            self._input_count = 0
            self._last_reset = now

        self._input_count += 1
        if self._input_count > self.rate_limit:
            raise RateLimitError("Input rate limit exceeded")
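
The excerpt calls `_is_blocked_combination` without showing it. A minimal sketch of such a check follows, assuming key strings use a `'ctrl+alt+delete'` style with `+` as the separator; that format and the alias table are assumptions, not something the skill specifies.

```python
BLOCKED_COMBINATIONS = [
    ('ctrl', 'alt', 'delete'),
    ('win', 'r'),  # Run dialog
    ('win', 'x'),  # Power user menu
]

# Hypothetical aliases so 'Control+Alt+Del' and 'ctrl+alt+delete' compare equal.
ALIASES = {'control': 'ctrl', 'del': 'delete', 'windows': 'win', 'super': 'win'}


def parse_combination(keys: str) -> frozenset:
    """Split 'Ctrl + Alt + Del' into a normalized set of key names."""
    parts = (part.strip().lower() for part in keys.split('+'))
    return frozenset(ALIASES.get(part, part) for part in parts if part)


def is_blocked_combination(keys: str, blocked=BLOCKED_COMBINATIONS) -> bool:
    """Return True if the pressed keys contain any blocked combination."""
    pressed = parse_combination(keys)
    # Subset test: extra modifiers (e.g. shift) do not unblock a combination.
    return any(frozenset(combo) <= pressed for combo in blocked)
```

Comparing sets rather than strings makes the check independent of key order, spacing, and letter case.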

Pattern 3: Process Validation

When to use: Before any automation interaction

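import hashlib

import psutil


class ProcessValidator:
    """Validate processes before automation."""

    def __init__(self, blocked_processes, known_hashes=None):
        # Pass the blocklist in explicitly, e.g. SecureUIAutomation.BLOCKED_PROCESSES
        self.blocked_processes = {name.lower() for name in blocked_processes}
        self.known_hashes = known_hashes or {}  # Load from secure config

    def validate_process(self, pid: int) -> bool:
        """Validate process identity and integrity."""
        try:
            proc = psutil.Process(pid)

            # Check process name against blocklist
            if proc.name().lower() in self.blocked_processes:
                return False

            # Verify executable integrity (optional, HIGH security)
            exe_path = proc.exe()
            if not self._verify_integrity(exe_path):
                return False

            # Check process owner
            if not self._check_owner(proc):
                return False

            return True

        except (psutil.NoSuchProcess, psutil.AccessDenied, OSError):
            # Fail closed if the process is gone or cannot be inspected
            return False

    def _check_owner(self, proc: psutil.Process) -> bool:
        """Assumed policy: allow only processes owned by the current user."""
        return proc.username() == psutil.Process().username()

    def _verify_integrity(self, exe_path: str) -> bool:
        """Verify executable hash against known good values."""
        if exe_path not in self.known_hashes:
            return True  # Skip if no hash available

        # Hash in chunks so large executables are not read into memory at once
        digest = hashlib.sha256()
        with open(exe_path, 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 20), b''):
                digest.update(chunk)

        return digest.hexdigest() == self.known_hashes[exe_path]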
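
The validator's `known_hashes` is described only as "Load from secure config". As a sketch of what that loading step might look like, the function below reads a JSON object mapping executable paths to SHA-256 hex digests. The file format and the name `load_known_hashes` are assumptions for illustration.

```python
import json
import ntpath
import re

_SHA256_RE = re.compile(r'^[0-9a-f]{64}$')


def load_known_hashes(config_path: str) -> dict:
    """Load {executable path: sha256 hex digest}, rejecting malformed entries."""
    with open(config_path, 'r', encoding='utf-8') as f:
        raw = json.load(f)
    if not isinstance(raw, dict):
        raise ValueError('Hash config must be a JSON object')

    hashes = {}
    for exe_path, digest in raw.items():
        digest = str(digest).lower()
        if not _SHA256_RE.match(digest):
            raise ValueError(f'Invalid SHA-256 digest for {exe_path!r}')
        # Windows paths are case-insensitive; normcase makes lookups consistent.
        hashes[ntpath.normcase(exe_path)] = digest
    return hashes
```

Because the keys are stored normalized, a lookup should apply `ntpath.normcase` to the path being checked as well.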

Pattern 4: Timeout Enforcement

When to use: All automation operations

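import threading


class TimeoutManager:
    """Enforce operation timeouts."""

    DEFAULT_TIMEOUT = 30  # seconds
    MAX_TIMEOUT = 300     # 5 minutes absolute max

    def run(self, func, *args, seconds: int = DEFAULT_TIMEOUT, **kwargs):
        """Run func(*args, **kwargs); raise TimeoutError if it overruns."""
        seconds = min(seconds, self.MAX_TIMEOUT)
        outcome = {}

        def worker():
            try:
                outcome['value'] = func(*args, **kwargs)
            except Exception as exc:
                outcome['error'] = exc

        # signal.SIGALRM and signal.alarm() are Unix-only, so on Windows the
        # operation runs in a worker thread and the caller waits with a deadline.
        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        thread.join(seconds)

        if thread.is_alive():
            # The worker cannot be killed; it is abandoned and keeps running.
            raise TimeoutError(f"Operation timed out after {seconds}s")
        if 'error' in outcome:
            raise outcome['error']
        return outcome['value']

# Usage
# Caveat: COM objects are tied to the apartment of the thread that created
# them, so UIA objects used by func must be usable from the worker thread.
timeout_mgr = TimeoutManager()

element = timeout_mgr.run(automation.find_element, 'notepad.exe', 'Edit1', seconds=10)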
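
Many UI operations are better served by polling than by a hard timeout, for example waiting for a window or control to appear. Here is a minimal sketch of a deadline-based wait; the helper name `wait_until` and its default values are illustrative assumptions.

```python
import time


def wait_until(predicate, timeout: float = 10.0, interval: float = 0.2):
    """Poll predicate() until it returns a truthy value or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'Condition not met within {timeout}s')
        time.sleep(interval)
```

A call such as `wait_until(lambda: automation.find_element('notepad.exe', 'Edit1'), timeout=10)` would retry the lookup until the element exists. Because each attempt returns on its own, nothing has to be interrupted from outside.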

5. Security Standards

5.1 Critical Vulnerabilities (Top 5)

Research Date: 2025-01-15

1. UI Automation Privilege Escalation (CVE-2023-28218)

Severity: HIGH
Description: UIA can be abused to inject input into elevated processes
Mitigation: Validate process elevation level before interaction

2. SendInput Injection

Severity: CRITICAL
Description: Input injection to bypass security promp

Content truncated.
