windows-ui-automation
Expert in Windows UI Automation (UIA) and Win32 APIs for desktop automation. Specializes in accessible, secure automation of Windows applications including element discovery, input simulation, and process interaction. HIGH-RISK skill requiring strict security controls for system access.
Install
mkdir -p .claude/skills/windows-ui-automation && curl -L -o skill.zip "https://mcp.directory/api/skills/download/276" && unzip -o skill.zip -d .claude/skills/windows-ui-automation && rm skill.zip
Installs to .claude/skills/windows-ui-automation
About this skill
File Organization: This skill uses a split structure. The main SKILL.md contains core decision-making context. See references/ for detailed implementations.
1. Overview
Risk Level: HIGH - System-level access, process manipulation, input injection capabilities
You are an expert in Windows UI Automation with deep expertise in:
- UI Automation Framework: UIA patterns, control patterns, automation elements
- Win32 API Integration: Window management, message passing, input simulation
- Accessibility Services: Screen readers, assistive technology interfaces
- Process Security: Safe automation boundaries, privilege management
You excel at:
- Automating Windows desktop applications safely and reliably
- Implementing robust element discovery and interaction patterns
- Managing automation sessions with proper security controls
- Building accessible automation that respects system boundaries
Core Expertise Areas
- UI Automation APIs: IUIAutomation, IUIAutomationElement, Control Patterns
- Win32 Integration: SendInput, SetForegroundWindow, EnumWindows
- Security Controls: Process validation, permission tiers, audit logging
- Error Handling: Timeout management, element state verification
Core Principles
- TDD First - Write tests before implementation code
- Performance Aware - Optimize element discovery and caching
- Security First - Validate processes, enforce permissions, audit all operations
- Fail Safe - Timeouts, graceful degradation, proper cleanup
2. Core Responsibilities
2.1 Safe Automation Principles
When performing UI automation, you will:
- Validate target processes before any interaction
- Enforce permission tiers (read-only, standard, elevated)
- Block sensitive applications (password managers, security tools, admin consoles)
- Log all operations for audit trails
- Implement timeouts to prevent runaway automation
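The permission tiers above could be modelled as an ordered enum with a minimum tier per action. This is a minimal sketch, not the skill's implementation: the action names and the tier assigned to each are assumptions made for illustration.

```python
from enum import IntEnum


class PermissionTier(IntEnum):
    """Tiers from section 2.1, ordered from least to most privileged."""
    READ_ONLY = 0
    STANDARD = 1
    ELEVATED = 2


# Hypothetical mapping of actions to the lowest tier allowed to run them.
REQUIRED_TIER = {
    "find_element": PermissionTier.READ_ONLY,
    "read_text": PermissionTier.READ_ONLY,
    "send_keys": PermissionTier.STANDARD,
    "click": PermissionTier.STANDARD,
    "launch_process": PermissionTier.ELEVATED,
}


def is_allowed(action: str, tier: PermissionTier) -> bool:
    """Return True if `tier` may perform `action`; unknown actions are denied."""
    required = REQUIRED_TIER.get(action)
    if required is None:
        return False  # fail safe: anything not listed is refused
    return tier >= required
```

Denying unlisted actions keeps the gate fail-safe: a new operation must be added to the table before any tier can use it.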
2.2 Security-First Approach
Every automation operation MUST:
- Verify process identity and integrity
- Check against blocked application list
- Validate user authorization level
- Log operation with correlation ID
- Enforce timeout limits
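Logging with a correlation ID might look like the sketch below, which uses only the standard `logging` and `uuid` modules. The logger name `uia.audit` and the field names are assumptions chosen for illustration.

```python
import logging
import uuid

logger = logging.getLogger("uia.audit")


def new_correlation_id() -> str:
    """Return a random identifier that ties related log records together."""
    return uuid.uuid4().hex


def audit(action: str, target_process: str, permission_tier: str,
          correlation_id: str) -> None:
    """Emit one structured audit record.

    Keys in `extra` become attributes on the LogRecord. Names that
    LogRecord already uses, such as `process` or `message`, are rejected
    by the logging module, hence `target_process`.
    """
    logger.info(
        "uia.%s", action,
        extra={
            "target_process": target_process,
            "permission_tier": permission_tier,
            "correlation_id": correlation_id,
        },
    )
```

Generating one ID per automation session and passing it to every `audit` call lets a reviewer reconstruct the full sequence of operations from the log.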
2.3 Accessibility Compliance
All automation must:
- Respect accessibility APIs and screen reader compatibility
- Not interfere with assistive technologies
- Maintain UI state consistency
- Handle focus management properly
3. Technical Foundation
3.1 Core Technologies
Primary Framework: Windows UI Automation (UIA)
- Recommended: Windows 10/11 with UIA v3
- Minimum: Windows 7 with UIA v2
- Avoid: Legacy MSAA-only approaches
Key Dependencies:
UIAutomationCore.dll # Native UIA runtime and COM interfaces (IUIAutomation)
UIAutomationClient.dll # Managed (.NET) UIA client assembly
user32.dll # Win32 input/window APIs
kernel32.dll # Process management
3.2 Essential Libraries
| Library | Purpose | Security Notes |
|---|---|---|
| comtypes / pywinauto | Python UIA bindings | Validate element access |
| UIAutomationClient | .NET UIA wrapper | Use with restricted permissions |
| Win32 API | Low-level control | Requires careful input validation |
4. Implementation Patterns
Pattern 1: Secure Element Discovery
When to use: Finding UI elements for automation
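import logging
import uuid

import psutil
from comtypes.client import CreateObject, GetModule

# Generate the UIA wrappers from the type library, then import them
GetModule('UIAutomationCore.dll')
from comtypes.gen.UIAutomationClient import CUIAutomation, IUIAutomation

# UI Automation constants (from UIAutomationClient.h)
UIA_ProcessIdPropertyId = 30002
UIA_AutomationIdPropertyId = 30011
TreeScope_Children = 2
TreeScope_Descendants = 4


class SecurityError(Exception):
    """Raised when an operation violates the automation security policy."""


class SecureUIAutomation:
    """Secure wrapper for UI Automation operations."""

    # Lowercase executable names; .msc snap-ins run inside mmc.exe
    BLOCKED_PROCESSES = {
        'keepass.exe', '1password.exe', 'lastpass.exe',  # Password managers
        'mmc.exe', 'secpol.msc', 'gpedit.msc',           # Admin tools
        'regedit.exe', 'cmd.exe', 'powershell.exe',      # System tools
        'taskmgr.exe', 'procexp.exe',                    # Process tools
    }

    def __init__(self, permission_tier: str = 'read-only'):
        self.permission_tier = permission_tier
        self.uia = CreateObject(CUIAutomation._reg_clsid_, interface=IUIAutomation)
        self.logger = logging.getLogger('uia.security')
        self.operation_timeout = 30  # seconds
        self.correlation_id = uuid.uuid4().hex  # one ID per automation session

    def find_element(self, process_name: str, element_id: str):
        """Find an element with security validation.

        Assumes element_id is a UIA AutomationId. Returns None if no match.
        """
        # Security check: blocked processes
        if process_name.lower() in self.BLOCKED_PROCESSES:
            self.logger.warning(
                'blocked_process_access',
                extra={'target_process': process_name, 'reason': 'security_policy'},
            )
            raise SecurityError(f"Access to {process_name} is blocked")

        # Find a top-level window owned by the named process
        root = self.uia.GetRootElement()
        window = None
        for proc in psutil.process_iter(['pid', 'name']):
            if (proc.info['name'] or '').lower() != process_name.lower():
                continue
            condition = self.uia.CreatePropertyCondition(
                UIA_ProcessIdPropertyId, proc.info['pid'])
            window = root.FindFirst(TreeScope_Children, condition)
            if window:
                break
        if not window:
            return None

        # Search that window's subtree for the requested element
        condition = self.uia.CreatePropertyCondition(
            UIA_AutomationIdPropertyId, element_id)
        element = window.FindFirst(TreeScope_Descendants, condition)
        if element:
            self._audit_log('element_found', process_name, element_id)
        return element

    def _get_correlation_id(self) -> str:
        return self.correlation_id

    def _audit_log(self, action: str, process: str, element: str):
        """Log operation for audit trail."""
        self.logger.info(
            f'uia.{action}',
            extra={
                # 'process' is a reserved LogRecord attribute, so use another key
                'target_process': process,
                'element': element,
                'permission_tier': self.permission_tier,
                'correlation_id': self._get_correlation_id(),
            }
        )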
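The blocklist check in Pattern 1 compares the lowercased name directly, so a full path or a name without its extension would not match. One way to close that gap is to normalize the input first. The sketch below is an illustration under stated assumptions: `normalize_process_name` and `is_blocked` are hypothetical helpers, the blocklist is abbreviated, and appending `.exe` to extensionless names is a policy choice rather than something the skill specifies.

```python
import ntpath

# Abbreviated copy of the Pattern 1 blocklist, for illustration only.
BLOCKED_PROCESSES = frozenset({"keepass.exe", "regedit.exe", "powershell.exe"})


def normalize_process_name(name_or_path: str) -> str:
    """Reduce a Windows path or bare name to a lowercase executable name."""
    cleaned = name_or_path.strip().strip('"')
    base = ntpath.basename(cleaned).lower()  # ntpath handles both \ and /
    if not ntpath.splitext(base)[1]:
        base += ".exe"  # assumption: extensionless names refer to executables
    return base


def is_blocked(name_or_path: str) -> bool:
    """True if the normalized name is on the blocklist."""
    return normalize_process_name(name_or_path) in BLOCKED_PROCESSES
```

`ntpath` is used instead of `os.path` so the function behaves the same when tests run on a non-Windows machine.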
Pattern 2: Safe Input Simulation
When to use: Sending keyboard/mouse input to applications
import ctypes                # used by the Win32 helpers (not shown)
from ctypes import wintypes  # used by the Win32 helpers (not shown)
import time


class SecurityError(Exception):
    """Operation rejected by security policy."""


class AutomationError(Exception):
    """Automation step could not be completed."""


class RateLimitError(Exception):
    """Too many inputs sent within the rate-limit window."""


class SafeInputSimulator:
    """Input simulation with security controls."""

    # Blocked key combinations
    BLOCKED_COMBINATIONS = [
        ('ctrl', 'alt', 'delete'),
        ('win', 'r'),  # Run dialog
        ('win', 'x'),  # Power user menu
    ]

    def __init__(self, permission_tier: str):
        if permission_tier == 'read-only':
            raise PermissionError("Input simulation requires 'standard' or 'elevated' tier")
        self.permission_tier = permission_tier
        self.rate_limit = 100  # max inputs per second
        self._input_count = 0
        self._last_reset = time.monotonic()

    # The helpers _is_valid_target, _is_blocked_combination, _safe_set_focus
    # and _send_input_safe are not shown in this excerpt.

    def send_keys(self, keys: str, target_hwnd: int):
        """Send keystrokes with validation."""
        # Rate limiting
        self._check_rate_limit()
        # Validate target window
        if not self._is_valid_target(target_hwnd):
            raise SecurityError("Invalid target window")
        # Check for blocked combinations
        if self._is_blocked_combination(keys):
            raise SecurityError(f"Key combination '{keys}' is blocked")
        # Ensure target has focus
        if not self._safe_set_focus(target_hwnd):
            raise AutomationError("Could not set focus to target")
        # Send input
        self._send_input_safe(keys)

    def _check_rate_limit(self):
        """Prevent input flooding (fixed one-second window)."""
        now = time.monotonic()  # monotonic clock is unaffected by system time changes
        if now - self._last_reset > 1.0:
            self._input_count = 0
            self._last_reset = now
        self._input_count += 1
        if self._input_count > self.rate_limit:
            raise RateLimitError("Input rate limit exceeded")
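Pattern 2 calls `_is_blocked_combination` without showing it. A possible shape for that check is sketched below as a standalone function. The `+`-separated key syntax and the alias table are assumptions. The check also rejects any chord that contains a blocked combination, which is a stricter policy than an exact match.

```python
BLOCKED_COMBINATIONS = {
    frozenset({"ctrl", "alt", "delete"}),
    frozenset({"win", "r"}),  # Run dialog
    frozenset({"win", "x"}),  # Power user menu
}

# Hypothetical spelling variants mapped to the canonical key names above.
ALIASES = {"control": "ctrl", "del": "delete", "windows": "win", "super": "win"}


def parse_combination(keys: str) -> frozenset:
    """Split a chord such as 'Ctrl+Alt+Del' into a normalized, unordered set."""
    parts = (p.strip().lower() for p in keys.split("+"))
    return frozenset(ALIASES.get(p, p) for p in parts if p)


def is_blocked_combination(keys: str) -> bool:
    """True if the chord includes every key of some blocked combination."""
    pressed = parse_combination(keys)
    return any(blocked <= pressed for blocked in BLOCKED_COMBINATIONS)
```

Comparing sets rather than strings makes the check independent of key order, capitalization, and spacing.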
Pattern 3: Process Validation
When to use: Before any automation interaction
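import hashlib

import psutil


class ProcessValidator:
    """Validate processes before automation."""

    def __init__(self, blocked_processes=frozenset(), known_hashes=None):
        # Lowercase executable names that must never be automated,
        # e.g. SecureUIAutomation.BLOCKED_PROCESSES from Pattern 1
        self.blocked_processes = blocked_processes
        # Executable path -> expected SHA-256 hex digest; load from secure config
        self.known_hashes = known_hashes or {}

    def validate_process(self, pid: int) -> bool:
        """Validate process identity and integrity."""
        try:
            proc = psutil.Process(pid)
            # Check process name against blocklist
            if proc.name().lower() in self.blocked_processes:
                return False
            # Verify executable integrity (optional, HIGH security)
            exe_path = proc.exe()
            if not self._verify_integrity(exe_path):
                return False
            # Check process owner
            if not self._check_owner(proc):
                return False
            return True
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            # A process that exited or cannot be inspected is treated as invalid
            return False

    def _verify_integrity(self, exe_path: str) -> bool:
        """Verify executable hash against known good values."""
        if exe_path not in self.known_hashes:
            return True  # Skip if no hash available (fails open)
        with open(exe_path, 'rb') as f:
            file_hash = hashlib.sha256(f.read()).hexdigest()
        return file_hash == self.known_hashes[exe_path]

    def _check_owner(self, proc: psutil.Process) -> bool:
        """Assumption: only processes owned by the current user may be automated."""
        return proc.username() == psutil.Process().username()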
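`_verify_integrity` reads the whole executable into memory and treats a missing hash as a pass. Where a stricter policy is wanted, the check could hash in chunks and fail closed. This is a sketch under assumptions: the `strict` flag and both function names are hypothetical, and `known_hashes` is taken to map file paths to SHA-256 hex digests.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large binaries are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_integrity(path: str, known_hashes: dict, strict: bool = True) -> bool:
    """Compare a file's hash with its expected value.

    With strict=True a file that has no recorded hash is rejected; with
    strict=False it is accepted, matching the behaviour in Pattern 3.
    """
    expected = known_hashes.get(path)
    if expected is None:
        return not strict
    # compare_digest avoids leaking the position of the first mismatch
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```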
Pattern 4: Timeout Enforcement
When to use: All automation operations
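import threading


class TimeoutManager:
    """Enforce operation timeouts.

    signal.SIGALRM and signal.alarm() are not available on Windows, so the
    operation runs on a daemon worker thread and the caller waits with a
    deadline. A timed-out operation is abandoned, not interrupted.
    """

    DEFAULT_TIMEOUT = 30  # seconds
    MAX_TIMEOUT = 300     # 5 minutes absolute max

    def run(self, func, *args, timeout: float = DEFAULT_TIMEOUT, **kwargs):
        """Call func(*args, **kwargs); raise TimeoutError if it overruns."""
        timeout = min(timeout, self.MAX_TIMEOUT)
        outcome = {}

        def worker():
            # Note: for UIA calls the worker thread may need to initialize
            # COM itself (e.g. comtypes.CoInitialize()) before calling func.
            try:
                outcome['value'] = func(*args, **kwargs)
            except BaseException as exc:
                outcome['error'] = exc

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        thread.join(timeout)
        if thread.is_alive():
            raise TimeoutError(f"Operation timed out after {timeout}s")
        if 'error' in outcome:
            raise outcome['error']
        return outcome.get('value')


# Usage (automation is a SecureUIAutomation instance from Pattern 1)
timeout_mgr = TimeoutManager()
element = timeout_mgr.run(automation.find_element, 'notepad.exe', 'Edit1', timeout=10)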
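Many automation failures come from an element that has not appeared yet, not from a call that hangs. For those cases a cooperative wait with a deadline is often enough. The sketch below is one way to do it; `wait_until` and its parameters are hypothetical names, and `probe` stands for any callable that returns a truthy value once the condition holds.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(probe: Callable[[], Optional[T]], timeout: float = 10.0,
               interval: float = 0.2) -> T:
    """Call `probe` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout}s")
        time.sleep(interval)
```

A caller might pass `lambda: automation.find_element('notepad.exe', 'Edit1')` as the probe, so the lookup is retried until the element exists or the deadline passes.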
5. Security Standards
5.1 Critical Vulnerabilities (Top 5)
Research Date: 2025-01-15
1. UI Automation Privilege Escalation
- Severity: HIGH
- Description: UIA can be abused to inject input into elevated processes
- Mitigation: Validate process elevation level before interaction
2. SendInput Injection
- Severity: CRITICAL
- Description: Input injection to bypass security promp
Content truncated.
More by martinholovsky