Amazon Elastic Container Service (ECS) is AWS’s fully managed container orchestration service that enables you to deploy, manage, and scale containerized applications. ECS integrates deeply with the AWS ecosystem, providing native support for GPU-accelerated workloads, Auto Scaling, Application Load Balancers, and CloudWatch monitoring.
Deploy Fireworks inference entirely within your own Virtual Private Cloud (VPC), with a VPC-native architecture, internal-only API endpoints, and data that never leaves your AWS environment. This helps you meet strict regulatory requirements for healthcare, financial services, and government workloads while making use of your existing AWS Enterprise Discount Program (EDP) agreement and Reserved Instances.
Access Fireworks’ full optimization stack, including FireOptimizer, adaptive caching, and speculative decoding. Deploy on GPU instances with the latest accelerators (H100, H200, B200, etc.), with automatic scaling to handle high-throughput, low-latency workloads within the same VPC.
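As a rough illustration of what a GPU-backed, VPC-attached ECS workload looks like, the sketch below builds a request body in the shape accepted by the ECS `RegisterTaskDefinition` API. It is not the Fireworks deployment artifact: the function name, task family, container name, image URI, port, and CPU/memory sizes are all hypothetical placeholders. It does rely on two ECS facts: GPUs are requested per container through `resourceRequirements`, and GPU tasks run on the EC2 launch type.

```python
import json


def build_gpu_task_definition(family, image, gpu_count=1,
                              cpu_units=4096, memory_mib=16384):
    """Return a dict shaped like an ECS RegisterTaskDefinition request.

    All argument values are illustrative; substitute the ones that
    apply to your own deployment.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")

    return {
        "family": family,
        # GPU tasks run on EC2-backed capacity rather than Fargate.
        "requiresCompatibilities": ["EC2"],
        # awsvpc gives each task its own network interface inside the VPC.
        "networkMode": "awsvpc",
        "containerDefinitions": [
            {
                "name": "inference",  # hypothetical container name
                "image": image,       # hypothetical image URI
                "essential": True,
                "cpu": cpu_units,
                "memory": memory_mib,
                # Hypothetical serving port.
                "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
                # ECS expects the GPU count as a string.
                "resourceRequirements": [
                    {"type": "GPU", "value": str(gpu_count)}
                ],
            }
        ],
    }


if __name__ == "__main__":
    task_def = build_gpu_task_definition(
        family="inference-demo",  # hypothetical family name
        image="123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:latest",
        gpu_count=8,
    )
    print(json.dumps(task_def, indent=2))
```

The resulting dictionary could be passed to an AWS SDK call such as boto3's `register_task_definition(**task_def)`. Scaling policies, load balancer wiring, and subnet or security group selection are configured separately on the ECS service and are left out here.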