Deploying Microservices to AWS: API Gateway, Service Discovery, and Security Groups in Practice

A practical guide to deploying 5 microservices on AWS ECS Fargate — covering the real problems you'll face, how service discovery works, and how to lock down security groups so services only talk to who they're supposed to.

Yudi Nugraha
May 4, 2026

Running microservices on your local machine is straightforward — each service starts on a different port, and they talk to each other over localhost. The moment you move to AWS, everything changes.

Services are spread across containers, availability zones, and private subnets. IP addresses are ephemeral. A container that crashes and restarts gets a new IP. And every network path that isn't explicitly allowed is blocked by default.

This post works through a concrete scenario: 5 microservices deployed on AWS ECS Fargate, one of which acts as the API Gateway. We'll cover how service discovery solves the ephemeral IP problem, and how security groups enforce the communication rules that keep your infrastructure from becoming an open network.

---

The Problem

Suppose you've built five services:

| Service | Role |
| --- | --- |
| api-gateway | Public entry point. Routes requests to internal services. |
| user-service | Handles authentication and user profiles. |
| product-service | Manages product catalogue and stock. |
| order-service | Creates and tracks orders. |
| notification-service | Sends emails and push notifications. |

On localhost, order-service calls user-service at http://localhost:3001. In AWS, user-service runs inside a private subnet on a container whose IP address changes every time it restarts. There is no localhost.

Three questions immediately surface:

  • How does order-service know where user-service is? (Service Discovery)
  • Who is allowed to call whom? (Security Groups)
  • How does external traffic reach the system without exposing every service directly? (API Gateway)

These aren't separate concerns. They are deeply connected, and getting one wrong breaks the others.

    ---

    Fundamental Concepts

    The API Gateway Pattern

    An API Gateway is the single entry point for all external traffic. Clients never call user-service or order-service directly. They call the gateway, and the gateway routes the request to the right internal service.

    This pattern gives you one place to handle:

  • TLS termination: HTTPS handled at the edge, HTTP inside the private network
  • Authentication: validate JWT tokens before forwarding requests
  • Rate limiting: protect backends from abuse
  • Request routing: /api/users/** goes to user-service, /api/orders/** goes to order-service

    In our setup, api-gateway is itself a microservice: a Node.js (or Nginx, or Kong) container that proxies traffic. It sits behind an Application Load Balancer (ALB), which is what the internet actually talks to.

    Internet → ALB (public) → api-gateway (private) → internal services
    

    Internal services are never reachable from the internet.
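
To make the routing idea concrete, here is a minimal sketch of prefix-based routing in Python. The route table and the function name are hypothetical, the /api/products prefix is an assumption (the article only names the users and orders prefixes), and a real gateway would also forward headers, bodies, and errors.

```python
from typing import Optional

# Hypothetical route table: path prefix -> upstream base URL.
# Hostnames and ports follow the article's service names.
ROUTES = {
    "/api/users": "http://user-service.internal:3001",
    "/api/products": "http://product-service.internal:3002",  # assumed prefix
    "/api/orders": "http://order-service.internal:3003",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream URL for a request path, or None if nothing matches."""
    for prefix, base_url in ROUTES.items():
        # Match the prefix itself or anything below it, but not "/api/usersX".
        if path == prefix or path.startswith(prefix + "/"):
            return base_url + path
    return None
```

Anything that does not match a prefix returns None, which the gateway would typically turn into a 404 instead of forwarding. That keeps services without a route, such as notification-service, out of reach at the application layer as well.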

    Service Discovery

    Service discovery is the mechanism that answers: "Where is user-service right now?"

    Without it, you'd hard-code IP addresses. That breaks the moment a container restarts.

    There are two approaches:

    Client-side discovery — the calling service queries a registry directly, gets an IP, and makes the call itself. The service registry (e.g., Consul, Eureka) must be maintained separately.

    Server-side discovery — the calling service calls a known DNS name (user-service.internal). A load balancer or DNS resolver handles routing to a healthy instance. The caller doesn't need to know the registry exists.

    On AWS, server-side discovery is the practical choice. AWS Cloud Map provides a private DNS namespace. ECS automatically registers tasks into Cloud Map when they start and deregisters them when they stop. A service calls http://user-service.internal:3001 and DNS resolves the name to the IPs of healthy tasks. Callers may keep using a cached answer until the record's TTL expires, so keep the TTL short (the examples below use 10 seconds).
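
From the caller's side, discovery is an ordinary DNS lookup. The sketch below shows one way to list the addresses currently published for a service name using only the Python standard library. The function name is hypothetical, and the injectable getaddrinfo parameter exists only so the function can be exercised without a real Cloud Map namespace.

```python
import socket
from typing import Callable, List


def resolve_service(hostname: str, port: int,
                    getaddrinfo: Callable = socket.getaddrinfo) -> List[str]:
    """Return the distinct IPv4 addresses currently published for a hostname."""
    infos = getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})
```

Inside the VPC, resolve_service("user-service.internal", 3001) would be expected to return one address per registered task. Most HTTP clients do this lookup implicitly, so application code rarely needs to call it directly.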

    Security Groups

    A Security Group is a stateful firewall attached to a network interface (in ECS Fargate, each task gets its own ENI). Rules are evaluated on every connection.

    Key properties:

  • Stateful: if a connection is allowed in one direction, its return traffic is allowed automatically, regardless of the rules in the other direction.
  • Default deny for inbound: a new security group has no inbound rules, so every inbound connection is blocked until you add a rule for it. Outbound traffic is allowed by default.
  • Reference other security groups: instead of specifying a CIDR range, you can say "allow traffic from the security group attached to api-gateway tasks." This is the correct approach for internal service-to-service communication, because it remains accurate even as IP addresses change.

    Best practice: one security group per service. Each service gets its own security group. Inbound rules reference the security group of the caller, not a CIDR block. This creates a machine-readable map of your intended communication graph.
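
The evaluation logic behind group references can be modelled in a few lines. This is a simplified sketch rather than how AWS implements it: it covers inbound rules only and ignores statefulness and outbound rules. The type alias and function name are hypothetical, and the group names are the article's labels.

```python
from typing import Dict, Set, Tuple

# Inbound rules: destination group -> set of (source group, port).
InboundRules = Dict[str, Set[Tuple[str, int]]]


def is_allowed(rules: InboundRules, source_sg: str, dest_sg: str, port: int) -> bool:
    """Allow a new connection only if dest_sg has an inbound rule naming source_sg."""
    # Default deny: a group with no matching rule rejects the connection.
    return (source_sg, port) in rules.get(dest_sg, set())


# Two-rule excerpt: the ALB may call the gateway, the gateway may call user-service.
example_rules: InboundRules = {
    "sg-gateway": {("sg-alb", 3000)},
    "sg-users": {("sg-gateway", 3001)},
}
```

Because the rule names a group and not an address, the answer does not change when a task restarts with a new IP.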

    ---

    Architecture Overview

           Internet
               │
               ▼
    ┌────────────────────────────────────────────────────────────────┐
    │ AWS VPC                                                        │
    │                                                                │
    │  Public subnets                                                │
    │    ALB (SG: sg-alb)                                            │
    │      │                                                         │
    │  Private subnets                                               │
    │      ▼                                                         │
    │    api-gateway (SG: sg-gateway)                                │
    │      ├──► user-service (SG: sg-users)                          │
    │      ├──► product-service (SG: sg-products)                    │
    │      └──► order-service (SG: sg-orders)                        │
    │             └──► notification-service (SG: sg-notifications)   │
    └────────────────────────────────────────────────────────────────┘
    

    The ALB lives in a public subnet. All ECS tasks run in private subnets with no direct internet access. Communication between services happens over private DNS names provided by AWS Cloud Map.

    ---

    Practical Implementation

    Step 1: Create the VPC and Subnets

    Use an existing VPC or create a dedicated one. At minimum, you need:

  • 2 public subnets (for the ALB — required for high availability)
  • 2 private subnets (for ECS tasks)
  • A NAT Gateway in a public subnet (so private tasks can pull images and make outbound calls)

    # Using AWS CLI: create a VPC
    aws ec2 create-vpc --cidr-block 10.0.0.0/16 --tag-specifications \
      'ResourceType=vpc,Tags=[{Key=Name,Value=microservices-vpc}]'
    
    # Public subnets (replace with your AZs and VPC ID)
    aws ec2 create-subnet --vpc-id vpc-XXXXX \
      --cidr-block 10.0.1.0/24 --availability-zone us-east-1a \
      --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-1a}]'
    
    aws ec2 create-subnet --vpc-id vpc-XXXXX \
      --cidr-block 10.0.2.0/24 --availability-zone us-east-1b \
      --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-1b}]'
    
    # Private subnets
    aws ec2 create-subnet --vpc-id vpc-XXXXX \
      --cidr-block 10.0.11.0/24 --availability-zone us-east-1a \
      --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-1a}]'
    
    aws ec2 create-subnet --vpc-id vpc-XXXXX \
      --cidr-block 10.0.12.0/24 --availability-zone us-east-1b \
      --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-1b}]'
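
Before creating anything, it can help to sanity-check the CIDR plan. The sketch below uses Python's ipaddress module to confirm that the four subnets above sit inside the VPC range and do not overlap. The function names are hypothetical; the figure of 5 reserved addresses per subnet is AWS's documented reservation.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List

VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS: Dict[str, ipaddress.IPv4Network] = {
    "public-1a": ipaddress.ip_network("10.0.1.0/24"),
    "public-1b": ipaddress.ip_network("10.0.2.0/24"),
    "private-1a": ipaddress.ip_network("10.0.11.0/24"),
    "private-1b": ipaddress.ip_network("10.0.12.0/24"),
}
AWS_RESERVED_PER_SUBNET = 5  # AWS reserves 5 addresses in every subnet


def validate_layout(vpc, subnets) -> List[str]:
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


def usable_addresses(net) -> int:
    """Addresses left for tasks and load balancer nodes in one subnet."""
    return net.num_addresses - AWS_RESERVED_PER_SUBNET
```

Each /24 leaves 251 usable addresses. That matters on Fargate because every task takes its own ENI and therefore its own IP.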
    

    Step 2: Create the ECS Cluster

    aws ecs create-cluster \
      --cluster-name microservices-cluster \
      --capacity-providers FARGATE \
      --default-capacity-provider-strategy capacityProvider=FARGATE,weight=1
    

    Step 3: Set Up AWS Cloud Map for Service Discovery

    Create a private DNS namespace. Every service will register itself here automatically.

    aws servicediscovery create-private-dns-namespace \
      --name internal \
      --vpc vpc-XXXXX \
      --description "Private namespace for microservices"
    

    This creates a namespace named internal. Services will be reachable at <service-name>.internal.

    Create a service discovery entry for each microservice:

    # user-service
    aws servicediscovery create-service \
      --name user-service \
      --dns-config "NamespaceId=ns-XXXXX,DnsRecords=[{Type=A,TTL=10}]" \
      --health-check-custom-config FailureThreshold=1
    
    # product-service
    aws servicediscovery create-service \
      --name product-service \
      --dns-config "NamespaceId=ns-XXXXX,DnsRecords=[{Type=A,TTL=10}]" \
      --health-check-custom-config FailureThreshold=1
    
    # order-service
    aws servicediscovery create-service \
      --name order-service \
      --dns-config "NamespaceId=ns-XXXXX,DnsRecords=[{Type=A,TTL=10}]" \
      --health-check-custom-config FailureThreshold=1
    
    # notification-service
    aws servicediscovery create-service \
      --name notification-service \
      --dns-config "NamespaceId=ns-XXXXX,DnsRecords=[{Type=A,TTL=10}]" \
      --health-check-custom-config FailureThreshold=1
    

    With this in place, order-service calls http://user-service.internal:3001 and DNS resolves to a healthy task. No hard-coded IPs. No custom discovery client in your application code.
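
The four create-service commands differ only in the service name, so they can be generated. Here is a small sketch that builds the argument list for each one, suitable for passing to subprocess.run. The function name is hypothetical, and ns-XXXXX is the same placeholder used above.

```python
from typing import List

INTERNAL_SERVICES = [
    "user-service",
    "product-service",
    "order-service",
    "notification-service",
]


def create_service_args(name: str, namespace_id: str, ttl: int = 10) -> List[str]:
    """Build the argv for `aws servicediscovery create-service` for one service."""
    dns_config = f"NamespaceId={namespace_id},DnsRecords=[{{Type=A,TTL={ttl}}}]"
    return [
        "aws", "servicediscovery", "create-service",
        "--name", name,
        "--dns-config", dns_config,
        "--health-check-custom-config", "FailureThreshold=1",
    ]


# One argument list per internal service; nothing is executed here.
COMMANDS = [create_service_args(name, "ns-XXXXX") for name in INTERNAL_SERVICES]
```

Passing each list to subprocess.run(args, check=True) would run the same commands as the shell block. Building the list explicitly avoids shell-quoting problems with the brackets and braces in the DNS config.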

    Step 4: Create Security Groups

    This step defines the communication rules for the entire system. Create one security group per service.

    VPC_ID=vpc-XXXXX

    # Note: EC2 does not accept group names that start with "sg-", so the
    # names below end in "-sg". The labels used in this article (sg-alb,
    # sg-gateway, ...) refer to these groups.

    # ALB security group: accepts HTTPS from the internet
    aws ec2 create-security-group \
      --group-name alb-sg \
      --description "ALB: accept HTTPS from internet" \
      --vpc-id $VPC_ID

    # api-gateway security group
    aws ec2 create-security-group \
      --group-name gateway-sg \
      --description "api-gateway service" \
      --vpc-id $VPC_ID

    # user-service security group
    aws ec2 create-security-group \
      --group-name users-sg \
      --description "user-service" \
      --vpc-id $VPC_ID

    # product-service security group
    aws ec2 create-security-group \
      --group-name products-sg \
      --description "product-service" \
      --vpc-id $VPC_ID

    # order-service security group
    aws ec2 create-security-group \
      --group-name orders-sg \
      --description "order-service" \
      --vpc-id $VPC_ID

    # notification-service security group
    aws ec2 create-security-group \
      --group-name notifications-sg \
      --description "notification-service" \
      --vpc-id $VPC_ID
    

    Now define the inbound rules. The key principle: each service only allows inbound traffic from the specific security group of its caller.

    # ALB: accept port 443 from the internet
    aws ec2 authorize-security-group-ingress \
      --group-id sg-alb-ID \
      --protocol tcp --port 443 --cidr 0.0.0.0/0
    
    # api-gateway: accept traffic only from the ALB
    aws ec2 authorize-security-group-ingress \
      --group-id sg-gateway-ID \
      --protocol tcp --port 3000 \
      --source-group sg-alb-ID
    
    # user-service: accept traffic from api-gateway and order-service
    # (earlier sections have order-service calling user-service directly,
    # so that path needs its own rule)
    aws ec2 authorize-security-group-ingress \
      --group-id sg-users-ID \
      --protocol tcp --port 3001 \
      --source-group sg-gateway-ID

    aws ec2 authorize-security-group-ingress \
      --group-id sg-users-ID \
      --protocol tcp --port 3001 \
      --source-group sg-orders-ID

    # product-service: accept traffic from api-gateway and order-service
    # (order-service checks stock before placing an order)
    aws ec2 authorize-security-group-ingress \
      --group-id sg-products-ID \
      --protocol tcp --port 3002 \
      --source-group sg-gateway-ID

    aws ec2 authorize-security-group-ingress \
      --group-id sg-products-ID \
      --protocol tcp --port 3002 \
      --source-group sg-orders-ID

    # order-service: accept traffic only from api-gateway
    aws ec2 authorize-security-group-ingress \
      --group-id sg-orders-ID \
      --protocol tcp --port 3003 \
      --source-group sg-gateway-ID

    # notification-service: accept traffic only from order-service
    # (only orders trigger notifications in this setup)
    aws ec2 authorize-security-group-ingress \
      --group-id sg-notifications-ID \
      --protocol tcp --port 3004 \
      --source-group sg-orders-ID


    The resulting communication map looks like this:

    ALB → api-gateway → user-service    ← order-service
                     → product-service ← order-service
                     → order-service   → notification-service
    

    notification-service cannot be reached from api-gateway directly — not because of application logic, but because the security group rule does not exist. The network itself enforces the architecture.
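
The same map can be kept as data and checked automatically. The sketch below lists one tuple per ingress rule and answers two questions: who may call a service directly, and what a given service can reach through a chain of allowed calls. The names are hypothetical. The order-service to user-service edge is an assumption drawn from the earlier sections, where order-service calls user-service, so drop that tuple if your services do not need it.

```python
from typing import Dict, List, Set, Tuple

# (caller, callee, port): one tuple per ingress rule.
RULES: List[Tuple[str, str, int]] = [
    ("alb", "api-gateway", 3000),
    ("api-gateway", "user-service", 3001),
    ("order-service", "user-service", 3001),  # assumption, see the text above
    ("api-gateway", "product-service", 3002),
    ("order-service", "product-service", 3002),
    ("api-gateway", "order-service", 3003),
    ("order-service", "notification-service", 3004),
]


def direct_callers(rules: List[Tuple[str, str, int]], callee: str) -> Set[str]:
    """Services whose security group is referenced by an inbound rule on callee."""
    return {src for src, dst, _port in rules if dst == callee}


def reachable_from(rules: List[Tuple[str, str, int]], start: str) -> Set[str]:
    """Services reachable from start by following allowed edges transitively."""
    graph: Dict[str, Set[str]] = {}
    for src, dst, _port in rules:
        graph.setdefault(src, set()).add(dst)
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

The distinction is worth keeping in mind: api-gateway is not a direct caller of notification-service, but a request to the gateway can still cause a notification by way of order-service. Security groups restrict who opens connections, not which requests end up having downstream effects.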

    Step 5: Create ECS Task Definitions

    Each service needs a task definition. Here's the one for api-gateway:

    {
      "family": "api-gateway",
      "networkMode": "awsvpc",
      "requiresCompatibilities": ["FARGATE"],
      "cpu": "256",
      "memory": "512",
      "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
      "containerDefinitions": [
        {
          "name": "api-gateway",
          "image": "ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/api-gateway:latest",
          "portMappings": [
            {
              "containerPort": 3000,
              "protocol": "tcp"
            }
          ],
          "environment": [
            { "name": "USER_SERVICE_URL", "value": "http://user-service.internal:3001" },
            { "name": "PRODUCT_SERVICE_URL", "value": "http://product-service.internal:3002" },
            { "name": "ORDER_SERVICE_URL", "value": "http://order-service.internal:3003" }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "/ecs/api-gateway",
              "awslogs-region": "us-east-1",
              "awslogs-stream-prefix": "ecs"
            }
          }
        }
      ]
    }
    

    The service URLs use the Cloud Map DNS names. These are static configuration. The name is resolved to an IP when the service connects, so traffic follows whichever tasks are currently registered in Cloud Map.

    Create the same structure for each service, adjusting the image, port, and environment variables accordingly.
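
Since the five task definitions share almost everything, one option is to generate them from a template. The sketch below is a hypothetical helper that mirrors the api-gateway JSON above. ACCOUNT is the same placeholder, and the environment variables passed in are up to each service.

```python
import json
from typing import Dict, Optional


def build_task_definition(name: str, port: int,
                          env: Optional[Dict[str, str]] = None,
                          account: str = "ACCOUNT",
                          region: str = "us-east-1") -> dict:
    """Return a Fargate task definition dict shaped like the api-gateway example."""
    return {
        "family": name,
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "256",
        "memory": "512",
        "executionRoleArn": f"arn:aws:iam::{account}:role/ecsTaskExecutionRole",
        "containerDefinitions": [
            {
                "name": name,
                "image": f"{account}.dkr.ecr.{region}.amazonaws.com/{name}:latest",
                "portMappings": [{"containerPort": port, "protocol": "tcp"}],
                "environment": [
                    {"name": key, "value": value}
                    for key, value in (env or {}).items()
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": f"/ecs/{name}",
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "ecs",
                    },
                },
            }
        ],
    }


def to_json(task_definition: dict) -> str:
    """Serialize for use with --cli-input-json file://..."""
    return json.dumps(task_definition, indent=2)
```

For example, writing to_json(build_task_definition("user-service", 3001)) to task-def-user-service.json would produce a file matching the names used in the register commands that follow.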

    aws ecs register-task-definition --cli-input-json file://task-def-api-gateway.json
    aws ecs register-task-definition --cli-input-json file://task-def-user-service.json
    aws ecs register-task-definition --cli-input-json file://task-def-product-service.json
    aws ecs register-task-definition --cli-input-json file://task-def-order-service.json
    aws ecs register-task-definition --cli-input-json file://task-def-notification-service.json
    

    Step 6: Create ECS Services

    Each ECS service links a task definition to a security group, subnets, and either a Cloud Map service discovery registry (for the internal services) or an ALB target group (for api-gateway).

    # Internal services — no ALB, just Cloud Map registration
    aws ecs create-service \
      --cluster microservices-cluster \
      --service-name user-service \
      --task-definition user-service:1 \
      --desired-count 2 \
      --launch-type FARGATE \
      --network-configuration "awsvpcConfiguration={
        subnets=[subnet-private-1a,subnet-private-1b],
        securityGroups=[sg-users-ID],
        assignPublicIp=DISABLED
      }" \
      --service-registries "registryArn=arn:aws:servicediscovery:us-east-1:ACCOUNT:service/srv-users-ID"
    

    Repeat for product-service, order-service, and notification-service, substituting their respective security group IDs and Cloud Map service ARNs.

    For api-gateway, attach it to the ALB instead:

    # Create target group for the ALB
    aws elbv2 create-target-group \
      --name tg-api-gateway \
      --protocol HTTP \
      --port 3000 \
      --vpc-id $VPC_ID \
      --target-type ip \
      --health-check-path /health
    
    # Create the ALB
    aws elbv2 create-load-balancer \
      --name alb-microservices \
      --subnets subnet-public-1a subnet-public-1b \
      --security-groups sg-alb-ID \
      --scheme internet-facing \
      --type application
    
    # Create HTTPS listener (assumes ACM certificate already exists)
    aws elbv2 create-listener \
      --load-balancer-arn arn:aws:elasticloadbalancing:...:loadbalancer/app/alb-microservices/... \
      --protocol HTTPS \
      --port 443 \
      --certificates CertificateArn=arn:aws:acm:us-east-1:ACCOUNT:certificate/CERT-ID \
      --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...:targetgroup/tg-api-gateway/...
    
    # Create the api-gateway ECS service
    aws ecs create-service \
      --cluster microservices-cluster \
      --service-name api-gateway \
      --task-definition api-gateway:1 \
      --desired-count 2 \
      --launch-type FARGATE \
      --network-configuration "awsvpcConfiguration={
        subnets=[subnet-private-1a,subnet-private-1b],
        securityGroups=[sg-gateway-ID],
        assignPublicIp=DISABLED
      }" \
      --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...:targetgroup/tg-api-gateway/...,containerName=api-gateway,containerPort=3000"
    

    Note that api-gateway still runs in a private subnet — the ALB is the only resource in the public subnet. Traffic from the internet hits the ALB, which forwards it to api-gateway over the private network. The gateway tasks never have a public IP.

    ---

    Verifying the Setup

    Check service discovery registration

    After each ECS service starts, its tasks register with Cloud Map automatically. Verify:

    aws servicediscovery list-instances \
      --service-id srv-users-ID
    

    You should see one entry per running task, each with a private IP address.
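
If you want to script this check, the JSON that list-instances prints can be parsed directly. The sketch below assumes the CLI's default JSON output, where each instance carries an Attributes map and ECS-registered instances include an AWS_INSTANCE_IPV4 attribute. The function name is hypothetical.

```python
import json
from typing import List


def instance_ips(list_instances_json: str) -> List[str]:
    """Extract private IPv4 addresses from `aws servicediscovery list-instances` output."""
    document = json.loads(list_instances_json)
    return [
        instance["Attributes"]["AWS_INSTANCE_IPV4"]
        for instance in document.get("Instances", [])
        if "AWS_INSTANCE_IPV4" in instance.get("Attributes", {})
    ]
```

Comparing the number of returned addresses with the service's desired count (2 in this setup) gives a quick signal that every task has registered.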

    Test internal DNS resolution

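    Connect to a container in the cluster (using ECS Exec) and resolve the internal DNS name. ECS Exec only works if the service was created or updated with --enable-execute-command and the task role allows the SSM messages actions. The container image also needs nslookup installed: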

    aws ecs execute-command \
      --cluster microservices-cluster \
      --task TASK-ID \
      --container api-gateway \
      --interactive \
      --command "nslookup user-service.internal"
    

    The response should list the private IPs of your user-service tasks.

    Confirm security group enforcement

    From an api-gateway task (using ECS Exec), try to open a connection to notification-service.internal on port 3004. It should time out, because sg-notifications has no inbound rule that references sg-gateway. The block happens at the network layer, before any application code runs.
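
A blocked path usually shows up as a timeout, not an immediate refusal, so a probe needs an explicit timeout. Here is a minimal TCP probe in Python that could be run from inside a task that should not have access. The function name is hypothetical, and it assumes a Python interpreter is available in the container.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

With the rules above in place, can_connect("notification-service.internal", 3004) would be expected to return False from an api-gateway task and True from an order-service task.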

    ---

    Common Mistakes

    Allowing 0.0.0.0/0 on internal services. It's tempting when debugging — but once you open a security group to the world, it's easy to forget to close it. Internal services should never accept traffic from the internet.

    Using CIDR ranges instead of security group references for internal rules. CIDR ranges break as soon as IP addresses change. Reference security groups by ID — they stay accurate regardless of which IPs your containers happen to have.

    One security group for all internal services. This is convenient to set up but destroys the communication map. notification-service would accept calls from api-gateway, even though it shouldn't. Separate security groups make your intended architecture visible and enforceable.

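    Forgetting health check endpoints. The ALB target group in this setup probes /health on api-gateway, and tasks that don't answer with a 200 are taken out of rotation and replaced. The Cloud Map services are created with --health-check-custom-config, so Cloud Map does not probe your containers itself: ECS reports each task's health to Cloud Map based on task state and the container health check in the task definition. If a service defines no container health check, a task that is running but broken can stay in DNS.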
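
A health endpoint can be very small. The sketch below uses Python's standard library only to show the shape: answer 200 on /health and 404 elsewhere. The article does not say what language the services are written in, so treat the handler, the port default, and the response body as illustrative.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = b'{"status":"ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Keep frequent health-check polling out of the logs.
        pass


def make_server(port: int = 3001) -> HTTPServer:
    """Create (but do not start) a server that answers health checks."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling make_server().serve_forever() starts it. In a real service the /health handler would live alongside the normal routes and ideally check that the service's own dependencies are usable before returning 200.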

    ---

    Summary

    | Layer | What it solves |
    | --- | --- |
    | ALB | Accepts public HTTPS traffic, terminates TLS, routes to api-gateway |
    | api-gateway (ECS service) | Single entry point, routes internal requests by path |
    | AWS Cloud Map | Resolves service-name.internal to live container IPs, so no hard-coded addresses |
    | Security Groups (per service) | Enforces who can call whom at the network level, independent of application logic |
    | Private subnets | Internal services are unreachable from the internet by default |

    The setup takes more effort than a single server, but the result is a system where the architecture is enforced by infrastructure — not convention. Adding a new service means creating a task definition, a Cloud Map entry, and a security group with explicit rules about who may call it. Everything else stays the same.
