AWS ECS container orchestration for running Docker containers. Use when deploying containerized applications, configuring task definitions, setting up services, managing clusters, or troubleshooting container issues.
Install
mkdir -p .claude/skills/ecs && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2414" && unzip -o skill.zip -d .claude/skills/ecs && rm skill.zip
Installs to .claude/skills/ecs
About this skill
AWS ECS
Amazon Elastic Container Service (ECS) is a fully managed container orchestration service. Run containers on AWS Fargate (serverless) or EC2 instances.
Core Concepts
Cluster
Logical grouping of tasks or services. Can contain Fargate tasks, EC2 instances, or both.
Task Definition
Blueprint for your application. Defines containers, resources, networking, and IAM roles.
Task
Running instance of a task definition. Can run standalone or as part of a service.
Service
Maintains desired count of tasks. Handles deployments, load balancing, and auto scaling.
Launch Types
| Type | Description | Use Case |
|---|---|---|
| Fargate | Serverless; pay for the vCPU and memory each task requests | Most workloads |
| EC2 | Self-managed instances | GPU workloads, host-level access, specific instance types |
Common Patterns
Create a Fargate Cluster
AWS CLI:
# Create cluster
aws ecs create-cluster --cluster-name my-cluster
# With capacity providers
aws ecs create-cluster \
  --cluster-name my-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1 \
    capacityProvider=FARGATE_SPOT,weight=1
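The weights in a capacity provider strategy are relative: with equal weights, ECS aims to place roughly half of the tasks on each provider. The arithmetic can be sketched in shell, assuming a simple proportional split. The helper name `split_by_weight` is hypothetical, and the scheduler's actual rounding and the optional `base` setting are not modeled.

```shell
# split_by_weight TOTAL WEIGHT_A WEIGHT_B
# Prints how many of TOTAL tasks a proportional split would assign to
# provider A and provider B. Illustrative only: ECS decides real placement.
split_by_weight() {
  total=$1; wa=$2; wb=$3
  a=$(( total * wa / (wa + wb) ))   # integer share for provider A (rounded down)
  b=$(( total - a ))                # the remainder goes to provider B
  echo "$a $b"
}

split_by_weight 10 1 1   # FARGATE weight 1, FARGATE_SPOT weight 1 -> prints "5 5"
split_by_weight 8 1 3    # FARGATE weight 1, FARGATE_SPOT weight 3 -> prints "2 6"
```

Raising the FARGATE_SPOT weight shifts more tasks to Spot capacity, which suits fault-tolerant workloads.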
Register Task Definition
cat > task-definition.json << 'EOF'
{
  "family": "web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "NODE_ENV", "value": "production"}
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
EOF
aws ecs register-task-definition --cli-input-json file://task-definition.json
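On Fargate, the task-level `cpu` and `memory` values must form one of a fixed set of pairs, and registration fails otherwise. A minimal pre-flight check might look like the sketch below. The function name `fargate_combo_ok` is hypothetical, and the size table reflects the combinations AWS has documented; confirm it against the current Fargate documentation before relying on it.

```shell
# fargate_combo_ok CPU MEMORY
# Returns 0 if the CPU units / memory (MiB) pair is a size Fargate accepts,
# non-zero otherwise. The table is an assumption to verify against AWS docs.
fargate_combo_ok() {
  cpu=$1; mem=$2
  case "$cpu" in
    256)   case "$mem" in 512|1024|2048) return 0 ;; esac; return 1 ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *)     return 1 ;;
  esac
  # Memory must lie in the allowed range and on an allowed increment.
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] && [ $(( mem % step )) -eq 0 ]
}

# The pair used in the task definition above:
fargate_combo_ok 256 512 && echo "256/512 is a valid Fargate size"
```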
Create Service with Load Balancer
aws ecs create-service \
  --cluster my-cluster \
  --service-name web-service \
  --task-definition web-app:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678,subnet-87654321],securityGroups=[sg-12345678],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/1234567890123456,containerName=web,containerPort=8080" \
  --health-check-grace-period-seconds 60
Run Standalone Task
aws ecs run-task \
  --cluster my-cluster \
  --task-definition my-batch-job:1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}"
Update Service (Deploy New Image)
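# Register new task definition with updated image
aws ecs register-task-definition --cli-input-json file://task-definition.json
# Update service to use the new revision.
# --force-new-deployment is only required when redeploying an unchanged
# revision (for example, to re-pull a mutable tag such as :latest).
aws ecs update-service \
  --cluster my-cluster \
  --service web-service \
  --task-definition web-app:2 \
  --force-new-deployment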
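`update-service` returns as soon as the deployment is started, not when it finishes. One way to block until the rollout settles is `aws ecs wait services-stable`. The sketch below wraps both calls; the `run` and `deploy` helpers and the `DRY_RUN` variable are hypothetical names introduced here so the commands can be previewed without calling AWS.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# deploy CLUSTER SERVICE TASKDEF: point the service at a task definition
# revision, then wait until the service reports a steady state.
deploy() {
  run aws ecs update-service \
    --cluster "$1" --service "$2" --task-definition "$3" &&
  run aws ecs wait services-stable --cluster "$1" --services "$2"
}

# Preview the two commands without contacting AWS:
DRY_RUN=1 deploy my-cluster web-service web-app:2
```

Without `DRY_RUN=1`, the waiter exits non-zero if the service does not stabilize in time, which makes the function usable as a gate in a deployment script.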
Auto Scaling
# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/my-cluster/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 \
  --max-capacity 10
# Target tracking policy
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/my-cluster/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 120
  }'
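Both auto scaling commands identify the service with a resource ID of the form `service/<cluster>/<service>`, and a mistyped ID is an easy way to attach a policy to nothing. A small sketch that builds the ID and prints the registration command for review follows. The helper names `ecs_resource_id` and `scalable_target_cmd` are hypothetical, and the function only prints the command, it does not run it.

```shell
# ecs_resource_id CLUSTER SERVICE: build the Application Auto Scaling resource ID.
ecs_resource_id() {
  printf 'service/%s/%s\n' "$1" "$2"
}

# scalable_target_cmd CLUSTER SERVICE MIN MAX
# Prints the register-scalable-target command after a basic sanity check.
scalable_target_cmd() {
  if [ "$3" -gt "$4" ]; then
    echo "min-capacity must not exceed max-capacity" >&2
    return 1
  fi
  printf 'aws application-autoscaling register-scalable-target --service-namespace ecs --resource-id %s --scalable-dimension ecs:service:DesiredCount --min-capacity %s --max-capacity %s\n' \
    "$(ecs_resource_id "$1" "$2")" "$3" "$4"
}

scalable_target_cmd my-cluster web-service 2 10
```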
CLI Reference
Cluster Management
| Command | Description |
|---|---|
| `aws ecs create-cluster` | Create cluster |
| `aws ecs describe-clusters` | Get cluster details |
| `aws ecs list-clusters` | List clusters |
| `aws ecs delete-cluster` | Delete cluster |
Task Definitions
| Command | Description |
|---|---|
| `aws ecs register-task-definition` | Create task definition |
| `aws ecs describe-task-definition` | Get task definition |
| `aws ecs list-task-definitions` | List task definitions |
| `aws ecs deregister-task-definition` | Deregister version |
Services
| Command | Description |
|---|---|
| `aws ecs create-service` | Create service |
| `aws ecs update-service` | Update service |
| `aws ecs describe-services` | Get service details |
| `aws ecs delete-service` | Delete service |
Tasks
| Command | Description |
|---|---|
| `aws ecs run-task` | Run standalone task |
| `aws ecs stop-task` | Stop running task |
| `aws ecs describe-tasks` | Get task details |
| `aws ecs list-tasks` | List tasks |
Best Practices
Security
- Use task roles for AWS API access (not access keys)
- Use execution roles for ECR/Secrets access
- Store secrets in Secrets Manager or Parameter Store
- Run tasks in private subnets, with a NAT gateway or VPC endpoints for outbound access
- Enable CloudTrail for API auditing
Performance
- Right-size CPU/memory: monitor utilization and adjust
- Use Fargate Spot for fault-tolerant workloads (up to 70% savings compared with regular Fargate pricing)
- Enable Container Insights for monitoring
- Use service discovery for internal communication
Reliability
- Deploy across multiple AZs
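- Configure both container and load balancer health checks, with a grace period long enough for startup
- Set the target group deregistration delay so in-flight requests can drain before tasks stop
- Enable the deployment circuit breaker so failed deployments roll back automatically:
aws ecs update-service \
  --cluster my-cluster \
  --service web-service \
  --deployment-configuration '{
    "deploymentCircuitBreaker": {
      "enable": true,
      "rollback": true
    }
  }'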
Cost Optimization
- Use Fargate Spot for batch workloads
- Right-size task resources
- Scale services to zero (desired count 0) when they are not needed, for example in development environments
- Use capacity providers for mixed Fargate/Spot
Troubleshooting
Task Fails to Start
Check:
# View stopped tasks
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks "$(aws ecs list-tasks --cluster my-cluster --desired-status STOPPED --query 'taskArns[0]' --output text)"
Common causes:
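- Image not found or not pullable (wrong URI or tag, missing ECR permissions)
- Secrets access denied (execution role lacks permission to read the secret)
- Network configuration (subnets, security groups)
- Resource limits exceeded (insufficient CPU or memory capacity, or a service quota reached)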
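The `stoppedReason` field returned by `describe-tasks` usually points at one of these causes. A rough triage helper, matching error names that ECS commonly reports, might look like this. The function name `classify_stop_reason` and the hint texts are hypothetical, and the mapping is a heuristic rather than a complete list.

```shell
# classify_stop_reason "REASON TEXT"
# Maps a stoppedReason string to a first place to look. Heuristic only.
classify_stop_reason() {
  case "$1" in
    *CannotPullContainerError*)
      echo "image pull failed: check the image URI, ECR permissions, and network path" ;;
    *ResourceInitializationError*)
      echo "task setup failed: check secrets access and network configuration" ;;
    *OutOfMemoryError*)
      echo "container exceeded its memory limit" ;;
    *)
      echo "unclassified: read the full stoppedReason" ;;
  esac
}

# Typical use (requires AWS credentials, so shown as a comment):
#   reason=$(aws ecs describe-tasks --cluster my-cluster --tasks "$TASK_ARN" \
#     --query 'tasks[0].stoppedReason' --output text)
#   classify_stop_reason "$reason"
classify_stop_reason "CannotPullContainerError: pull access denied"
```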
Container Keeps Restarting
Debug:
# Check CloudWatch logs
aws logs get-log-events \
  --log-group-name /ecs/web-app \
  --log-stream-name "ecs/web/abc123"
# Check task details (set TASK_ARN to the ARN or ID of the task)
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks "$TASK_ARN" \
  --query 'tasks[0].containers[0].{reason:reason,exitCode:exitCode}'
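With the `awslogs` driver, the log stream name has the form `<stream-prefix>/<container-name>/<task-id>`, which is where a value like `ecs/web/abc123` comes from. A small sketch that derives it from a task ARN follows. The helper name `log_stream_name` is hypothetical, and the ARN shown is a made-up example.

```shell
# log_stream_name PREFIX CONTAINER TASK_ARN
# Builds the awslogs stream name for one container of one task.
log_stream_name() {
  task_id=${3##*/}    # keep only the text after the last "/" in the ARN
  printf '%s/%s/%s\n' "$1" "$2" "$task_id"
}

log_stream_name ecs web arn:aws:ecs:us-east-1:123456789012:task/my-cluster/abc123
# prints: ecs/web/abc123
```

The result can be passed to `--log-stream-name` in the `get-log-events` call.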
Causes:
- Health check failing
- Application crashing
- Out of memory
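The container `exitCode` often narrows these causes down. By the usual convention, a code above 128 means the process was terminated by signal number `code - 128`. A minimal interpreter, assuming that convention holds for the container's entrypoint (the function name `explain_exit_code` is hypothetical):

```shell
# explain_exit_code CODE
# Translates a container exit code into a likely cause.
explain_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$(( code - 128 ))    # signal number under the 128+N convention
    case "$sig" in
      9)  echo "killed by SIGKILL (often the out-of-memory killer)" ;;
      11) echo "killed by SIGSEGV (application crash)" ;;
      15) echo "stopped by SIGTERM (normal stop request)" ;;
      *)  echo "terminated by signal $sig" ;;
    esac
  elif [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  else
    echo "application exited with error status $code"
  fi
}

explain_exit_code 137   # prints: killed by SIGKILL (often the out-of-memory killer)
```

An exit code of 137 together with a `reason` mentioning memory is a strong hint to raise the task's memory setting.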
Service Stuck Deploying
# Check deployment status
aws ecs describe-services \
  --cluster my-cluster \
  --services web-service \
  --query 'services[0].deployments'
# Check events
aws ecs describe-services \
  --cluster my-cluster \
  --services web-service \
  --query 'services[0].events[:5]'
Causes:
- Container health check failing on new tasks
- Not enough capacity to place new tasks
- Target group health checks failing, so new tasks never become healthy behind the load balancer
Cannot Pull Image from ECR
Check execution role has:
{
  "Effect": "Allow",
  "Action": [
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage"
  ],
  "Resource": "*"
}
Also check:
- Tasks in private subnets can reach ECR through either a NAT gateway or VPC endpoints (ECR API, ECR Docker registry, and an S3 gateway endpoint for image layers)
- Security group allows HTTPS (port 443) outbound
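When a private subnet has no NAT gateway, image pulls depend on VPC endpoints in the image's region. The sketch below derives the endpoint service names from an ECR image URI. The helper name `ecr_endpoints_for` is hypothetical, and it assumes the standard `<account>.dkr.ecr.<region>.amazonaws.com` registry host format.

```shell
# ecr_endpoints_for IMAGE_URI
# Prints the VPC endpoint service names involved in pulling the image.
ecr_endpoints_for() {
  host=${1%%/*}                        # registry host, e.g. 123456789012.dkr.ecr.us-east-1.amazonaws.com
  region=${host#*.dkr.ecr.}            # drop the account and service prefix
  region=${region%%.amazonaws.com*}    # drop the domain suffix, leaving the region
  printf 'com.amazonaws.%s.ecr.api\n' "$region"
  printf 'com.amazonaws.%s.ecr.dkr\n' "$region"
  printf 'com.amazonaws.%s.s3\n' "$region"   # gateway endpoint: image layers are served from S3
}

ecr_endpoints_for 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
```

Other services the task uses at startup, such as CloudWatch Logs or Secrets Manager, need their own endpoints or a NAT route as well.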