Building Microservices
Sam Newman defines microservices as independently deployable services modeled around a business domain. The book is not about making services small — it is about making change safe, fast, and contained.
Five key properties of microservices:
---
The Monolith Is Not the Enemy
Newman distinguishes three monolith types:
Single-process monolith — all code runs in one process. The most common starting point.
Modular monolith — one process, but with explicit internal module boundaries. High cohesion inside modules, low coupling between them. Often the right architecture before microservices.
Distributed monolith — multiple services but tightly coupled through shared databases or synchronous call chains. Gets the worst of both worlds: deployment complexity without autonomy.
> Start with a modular monolith. Decompose into microservices only when the cost of coordination inside the monolith exceeds the cost of distribution.
---
Modeling Services Around Domains
Bounded Contexts (from DDD)
Each microservice should own one bounded context — a clear boundary within which a domain model is internally consistent.
Order Context: "Customer" means billing address, payment method
Marketing Context: "Customer" means email preferences, campaign history
→ Two services, two models, two meanings of "Customer"
→ They communicate through well-defined contracts, not shared tables
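As a rough illustration of the two meanings of "Customer", the sketch below gives each context its own model and lets only a small contract cross the boundary. The class and field names (`OrderCustomer`, `MarketingCustomer`, `CustomerRegistered`) are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderCustomer:
    """'Customer' as the Order context sees it (hypothetical fields)."""
    customer_id: str
    billing_address: str
    payment_method: str


@dataclass(frozen=True)
class MarketingCustomer:
    """'Customer' as the Marketing context sees it (hypothetical fields)."""
    customer_id: str
    email_opt_in: bool
    campaign_history: tuple = ()


@dataclass(frozen=True)
class CustomerRegistered:
    """Contract shared between contexts: only the ID crosses the boundary."""
    customer_id: str


def to_marketing_customer(event: CustomerRegistered) -> MarketingCustomer:
    # Marketing builds its own model from the contract, not from Order's tables.
    return MarketingCustomer(customer_id=event.customer_id, email_opt_in=False)


order_view = OrderCustomer("c1", "1 Main St", "card")
marketing_view = to_marketing_customer(CustomerRegistered("c1"))
```

Both views refer to the same customer ID, yet neither can see the other's fields, which is the point of keeping the models separate.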
Finding Service Boundaries
Use these signals:
Volatility matters: separate stable code from frequently-changing code into different services.
---
Splitting the Monolith
The Strangler Fig Pattern
Incrementally migrate from a monolith by routing traffic to new services, one capability at a time.
Phase 1: All traffic → Monolith
Phase 2: Traffic → Proxy
    ├── /orders → OrderService (new)
    └── /* → Monolith
Phase 3: Traffic → Proxy
    ├── /orders → OrderService
    ├── /users → UserService (new)
    └── /* → Monolith (shrinking)
Phase N: Monolith gone
Never do a big-bang rewrite. Extract one capability at a time, validate, then continue.
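The routing decision at the heart of the proxy can be sketched in a few lines. This is a minimal illustration, not a production proxy: the backend URLs and the `pick_backend` helper are assumptions made for the example.

```python
# Hypothetical routing table: path prefix -> backend that now owns it.
ROUTES = {
    "/orders": "http://order-service.internal",
    "/users": "http://user-service.internal",
}
MONOLITH = "http://monolith.internal"


def pick_backend(path: str, routes: dict = ROUTES, default: str = MONOLITH) -> str:
    """Return the backend for a request path; unmigrated paths fall through."""
    for prefix, backend in routes.items():
        # Match the prefix itself or anything beneath it, but not "/ordersX".
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return default
```

Migrating another capability then amounts to adding one entry to the table, and rolling back amounts to removing it.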
Database Decomposition
Shared databases are the most dangerous form of coupling — they bind services to each other's schema.
Steps to split:
1. Identify which code owns which tables
2. Add a seam: access shared tables only through an API owned by one service
3. Separate the schemas in the same database (different schemas, same server)
4. Move each schema to its own database
Patterns for managing data that multiple services need:
---
Communication Styles
The most consequential decision in microservice design: how do services talk to each other?
Synchronous vs Asynchronous
| | Synchronous | Asynchronous |
|---|---|---|
| Caller waits? | Yes | No |
| Coupling | Temporal (both must be up) | Loose |
| Complexity | Low | High |
| Failure model | Caller fails if callee is down | Caller continues; message queued |
Request–Response (Synchronous)
REST over HTTP — simple, ubiquitous, human-readable, good for public APIs.
POST /orders
→ 201 Created { "order_id": "o123" }
GET /orders/o123
→ 200 OK { "status": "processing" }
gRPC — binary protocol (Protocol Buffers), strongly typed, fast. Best for internal service-to-service calls.
service OrderService {
  rpc CreateOrder (CreateOrderRequest) returns (OrderResponse);
  rpc GetOrder (GetOrderRequest) returns (OrderResponse);
}
Event-Driven (Asynchronous)
Services publish events to a broker; consumers subscribe independently.
OrderService → publishes "OrderPlaced" → Message Broker (Kafka)
    ├── InventoryService (reserves stock)
    ├── BillingService (charges card)
    └── NotificationService (sends email)
Events decouple producers from consumers — OrderService does not know who reacts to OrderPlaced.
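The decoupling can be seen in a toy in-memory broker. This is only a sketch: `InMemoryBroker` is a hypothetical stand-in for a real broker such as Kafka, and it delivers synchronously, which a real broker would not.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Toy stand-in for a real broker; delivery is synchronous here."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The publisher never names its consumers; it only names the event.
        for handler in self._subscribers[event_type]:
            handler(payload)


broker = InMemoryBroker()
reactions = []
broker.subscribe("OrderPlaced", lambda e: reactions.append(("inventory", e["order_id"])))
broker.subscribe("OrderPlaced", lambda e: reactions.append(("billing", e["order_id"])))
broker.publish("OrderPlaced", {"order_id": "o123"})
```

Adding a third consumer requires another `subscribe` call and no change to the publishing code.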
Event types:
---
Sagas: Managing Distributed Transactions
Microservices cannot use database transactions across service boundaries. Sagas coordinate multi-step workflows with compensating actions on failure.
Choreography (Event-Based)
Each service reacts to events and emits its own events. No central coordinator.
OrderService → publishes OrderPlaced
InventoryService → reserves stock → publishes StockReserved
PaymentService → charges card → publishes PaymentTaken
OrderService → publishes OrderConfirmed
On failure:
PaymentService → publishes PaymentFailed
InventoryService → releases stock (compensating action)
OrderService → publishes OrderCancelled
Orchestration (Explicit Coordinator)
A saga orchestrator tells each service what to do, handles failures explicitly.
OrderSaga (orchestrator):
1. Call InventoryService.Reserve()
2. Call PaymentService.Charge()
3. If PaymentService fails → Call InventoryService.Release()
4. Emit OrderConfirmed or OrderFailed
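The orchestrator steps above can be sketched as a generic runner that undoes completed steps in reverse order when a later step fails. The `run_saga` helper and the step names are assumptions for illustration; the service calls are replaced by plain callables.

```python
from typing import Callable, List, Tuple

# (name, action, compensation)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Tuple[str, List[str]]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name} failed")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name} compensated")
            return "OrderFailed", log
        completed.append(step)
        log.append(f"{name} ok")
    return "OrderConfirmed", log


def charge_card() -> None:
    raise RuntimeError("card declined")  # simulated payment failure


outcome, log = run_saga([
    ("reserve_stock", lambda: None, lambda: None),
    ("charge_card", charge_card, lambda: None),
])
```

A real orchestrator would also persist its progress so that it can resume after a crash, which this sketch leaves out.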
> Newman recommends starting with choreography for simple workflows and using orchestration when the saga logic grows complex.
---
Build and Deployment
One Artifact Per Service
Each service has its own repository (or module) and its own CI pipeline. A commit to OrderService should only trigger a build and deploy of OrderService.
commit → CI: test + lint → artifact (Docker image) → push to registry → deploy
Shared CI pipelines that build everything together defeat the purpose of independent deployability.
Container and Kubernetes Basics
Each microservice runs in its own container.
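FROM python:3.12-slim
WORKDIR /app
# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0"]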
Kubernetes manages containers at scale:
Deployment → defines desired state (3 replicas of OrderService)
Service → stable network endpoint (load balances across replicas)
Ingress → routes external traffic to services
ConfigMap → externalized configuration
Secret → sensitive credentials
GitOps
Treat infrastructure as code in Git. The desired state of the cluster is declared in Git; an operator (ArgoCD, Flux) reconciles actual state to match.
Developer merges PR → Git repo updated → ArgoCD detects diff → applies to cluster
---
Testing Microservices
Testing Pyramid
▲ Few
│ End-to-End tests (slow, brittle, expensive)
│ Integration tests (medium speed, medium cost)
│ Unit tests (fast, cheap, many)
▼ Many
For microservices, end-to-end tests across many services are extremely fragile. Invest in unit and integration tests per service. Use contract tests to verify integration points.
Consumer-Driven Contract Testing
Each consumer service defines what it expects from a producer. The producer runs these contracts as tests.
OrderService (consumer) defines:
"I expect GET /users/{id} to return { id, name, email }"
UserService (producer) runs this contract in its CI pipeline:
→ confirms its API still satisfies OrderService's expectations
Tools: Pact, Spring Cloud Contract
This eliminates the need for a shared integration test environment for most scenarios.
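The core check behind a contract test can be sketched without any tooling. The code below is a hand-rolled illustration, not the Pact or Spring Cloud Contract API; the contract layout and the `contract_violations` helper are assumptions.

```python
# Hypothetical contract published by the consumer (OrderService).
ORDER_SERVICE_CONTRACT = {
    "request": "GET /users/{id}",
    "required_fields": {"id": str, "name": str, "email": str},
}


def contract_violations(response_body: dict, contract: dict) -> list:
    """Return a list of human-readable violations; empty means the contract holds."""
    problems = []
    for field_name, expected_type in contract["required_fields"].items():
        if field_name not in response_body:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(response_body[field_name], expected_type):
            problems.append(f"wrong type for field: {field_name}")
    return problems


# Extra fields are fine: the consumer only pins down what it actually reads.
current_response = {"id": "u1", "name": "Ada", "email": "ada@example.com", "plan": "pro"}
broken_response = {"id": "u1", "full_name": "Ada"}
```

In the producer's pipeline, a non-empty result for any consumer's contract would fail the build before the breaking change ships.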
---
Observability
In a distributed system, a single request crosses many processes, so attaching a debugger to one of them rarely shows the whole story. Observability is how you understand what the system is doing.
Three Pillars
Logs — timestamped records of discrete events.
Best practices:
- Structured logs (JSON), not plain text
- Consistent fields: timestamp, service, trace_id, level, message
- Aggregate with a log platform (ELK, Loki + Grafana)
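One way to get structured logs with the consistent fields listed above is a custom formatter on Python's standard `logging` module. This is a sketch under assumptions: the `JsonFormatter` class and the `order-service` name are made up for the example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a fixed set of fields."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "service": self.service,
            "trace_id": getattr(record, "trace_id", None),  # set via extra={...}
            "level": record.levelname,
            "message": record.getMessage(),
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="order-service"))
logger = logging.getLogger("order-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("order created", extra={"trace_id": "abc123"})
```

Because every line is a JSON object with the same keys, the log platform can filter on `trace_id` or `service` without parsing free text.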
Metrics — numeric measurements over time.
Key metrics per service:
- Request rate (requests/sec)
- Error rate (errors/sec or %)
- Latency (p50, p95, p99)
- Saturation (CPU, memory, queue depth)
USE method (for resources): Utilization, Saturation, Errors
RED method (for services): Rate, Errors, Duration
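To make the RED numbers concrete, the sketch below derives rate, error ratio, and latency percentiles from raw request durations. It assumes the nearest-rank percentile definition; metrics backends often use histogram approximations instead, and the function names are hypothetical.

```python
import math


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def red_summary(durations_ms: list, error_count: int, window_seconds: float) -> dict:
    """Summarise one non-empty window of requests as Rate, Errors, Duration."""
    total = len(durations_ms)
    return {
        "rate_per_sec": total / window_seconds,
        "error_ratio": error_count / total,
        "p50_ms": percentile(durations_ms, 50),
        "p95_ms": percentile(durations_ms, 95),
        "p99_ms": percentile(durations_ms, 99),
    }


# 100 requests in a 10-second window, with durations of 1..100 ms and 2 errors.
summary = red_summary(list(range(1, 101)), error_count=2, window_seconds=10)
```

Tail percentiles are reported separately from the median because a healthy p50 can hide a slow p99.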
Distributed Tracing — follow a request across multiple services.
User request → TraceID: abc123
OrderService [abc123, span: 1] 12ms
→ UserService [abc123, span: 2] 3ms
→ PaymentService [abc123, span: 3] 45ms ← bottleneck visible
Tools: OpenTelemetry (standard), Jaeger, Zipkin, Tempo
Correlation IDs
Propagate a unique ID through all calls in a request chain. Log it in every service. Allows reconstructing the full picture from logs when tracing is unavailable.
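import logging
from uuid import uuid4
logger = logging.getLogger(__name__)
# Middleware: attach incoming trace ID or generate a new one (`request` comes from the web framework)
trace_id = request.headers.get("X-Trace-Id") or str(uuid4())
logger.info("handling request", extra={"trace_id": trace_id})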
---
Security
Zero Trust
Assume the network is compromised. Verify every request explicitly.
Traditional: trust everything inside the network perimeter
Zero trust: authenticate and authorize every service-to-service call
mTLS (Mutual TLS)
Both sides of a connection present certificates. Guarantees identity of caller and callee.
OrderService ←→ PaymentService
OS presents cert → PS verifies
PS presents cert → OS verifies
→ Even if network is compromised, caller identity is guaranteed
Service meshes (Istio, Linkerd) handle mTLS automatically — services don't need to implement it.
JWT for User Identity
Pass user identity between services via signed tokens.
User authenticates → Identity Service issues JWT
JWT included in all downstream calls
Each service validates the JWT signature independently (no roundtrip)
Do not re-validate credentials at every service — validate the token, trust the claims.
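What "validate the token" involves can be sketched with the standard library alone. This is an illustration under loud assumptions: it handles only HS256 with a shared secret, and the helpers `sign_hs256` and `verify_hs256` are hypothetical. A real deployment would more likely use a vetted JWT library and an asymmetric algorithm, so that downstream services hold only a public key.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token (the Identity Service's job in the flow above)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
        if not (isinstance(header, dict) and isinstance(claims, dict)):
            raise TypeError("header and payload must be JSON objects")
    except Exception as exc:
        raise ValueError("malformed token") from exc
    if header.get("alg") != "HS256":  # never let the token choose the algorithm
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired or missing exp")
    return claims
```

The check needs no call back to the Identity Service, which is what allows each service to validate independently.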
---
Resiliency Patterns
Distributed systems fail in partial ways. Design each service to degrade gracefully.
Timeouts
Always set timeouts on outbound calls. A call that never returns blocks a thread indefinitely.
import requests
response = requests.get(url, timeout=2.0)  # raise if connecting or waiting for data stalls for more than 2s
Retries with Exponential Backoff
Retry transient failures, but space retries out to avoid hammering a degraded service.
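import time

def call_with_retries(max_attempts=3):
    for attempt in range(max_attempts):
        try:
            return call_service()
        except TransientError:
            # Back off before the next attempt; skip the pointless sleep after the last one
            if attempt < max_attempts - 1:
                time.sleep(0.5 * 2 ** attempt)  # 0.5s, then 1s
    raise MaxRetriesExceeded()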
Add jitter to prevent retry storms when many callers fail simultaneously.
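One common way to add jitter is the "full jitter" scheme, where each delay is drawn uniformly between zero and the exponential ceiling. The sketch below assumes that scheme; the `backoff_delays` name and the 10-second cap are illustrative choices.

```python
import random


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 10.0, rng=random.random):
    """Yield one delay per retry, drawn from [0, min(cap, base * 2**attempt))."""
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling
```

Passing the random source as `rng` keeps the schedule testable: with `rng=lambda: 1.0` the generator yields the plain exponential ceilings.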
Circuit Breaker
Stops calling a failing service, giving it time to recover.
CLOSED → calls pass through normally
→ failure rate exceeds threshold (e.g. 50% in 10 sec)
OPEN → calls fail immediately (no attempt made)
→ after timeout, one test call allowed
HALF-OPEN → if test succeeds → CLOSED; if fails → OPEN again
Libraries: resilience4j (Java), pybreaker (Python), opossum (Node.js)
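The state machine can be sketched in a small class. To stay short, this version opens after a number of consecutive failures instead of tracking a failure rate over a time window, which is a simplifying assumption. It does not reflect the API of any of the libraries listed above.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            self.state = "HALF-OPEN"  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.state = "CLOSED"
        self.failures = 0
        return result
```

The injectable `clock` exists so the timeout behaviour can be exercised without real waiting. This sketch is not thread-safe.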
Bulkhead
Isolate failures so they don't consume all shared resources.
Without bulkhead:
PaymentService slow → fills thread pool → all requests starve
With bulkhead:
PaymentService has its own thread pool (10 threads)
Other services have separate pools
PaymentService saturation only blocks payment calls
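A bulkhead does not have to be a separate thread pool; a per-dependency concurrency limit gives similar isolation. The sketch below uses a semaphore for that purpose. The `Bulkhead` class and the limit of 20 for the second dependency are assumptions for illustration.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a dependency's concurrency budget is exhausted."""


class Bulkhead:
    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Reject immediately instead of queueing, so callers fail fast.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slots for this dependency")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


payment_bulkhead = Bulkhead(max_concurrent=10)  # mirrors the 10-thread pool above
catalog_bulkhead = Bulkhead(max_concurrent=20)  # hypothetical second dependency
```

If payment calls hang, at most ten callers are stuck waiting on them; calls through `catalog_bulkhead` keep their own budget.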
Fallback
Define what to return when a call fails — a default, cached response, or degraded result.
ProductService fails during checkout
→ Fallback: return product name from order history cache
→ User can still complete checkout; images/descriptions missing
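The checkout example can be sketched as a small wrapper that tries the live call first and falls back to cached data, marking the result as degraded. The `with_fallback` helper, the cache contents, and the simulated outage are all assumptions for illustration.

```python
def with_fallback(primary, fallback):
    """Call primary(); if it raises, return fallback() and flag the result as degraded."""
    try:
        return {"data": primary(), "degraded": False}
    except Exception:
        return {"data": fallback(), "degraded": True}


# Hypothetical cache populated from order history.
ORDER_HISTORY_CACHE = {"p42": {"name": "Espresso Machine"}}


def fetch_product_live(product_id: str) -> dict:
    raise ConnectionError("ProductService unavailable")  # simulated outage


result = with_fallback(
    lambda: fetch_product_live("p42"),
    lambda: ORDER_HISTORY_CACHE["p42"],
)
```

The `degraded` flag lets the caller decide what to hide, such as images or descriptions, while still completing the checkout.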
---
Scaling
Four Axes of Scaling
Vertical scaling — bigger machine. Simple, limited, expensive.
Horizontal scaling — more instances behind a load balancer. Requires stateless services.
Data partitioning — split data across shards so each instance owns a subset.
Functional decomposition — extract high-load capabilities into their own service.
System is slow → profile → 80% of load is Search
→ Extract SearchService → scale it independently
→ Rest of system unaffected
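For the data partitioning axis, the simplest scheme maps each key to a shard by hashing it. The sketch below assumes plain hash-modulo sharding; the `shard_for` helper is hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    # Python's built-in hash() is salted per process for strings, so it is
    # unsuitable here; a cryptographic digest gives the same answer everywhere.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Hash-modulo is easy to reason about, but changing `shard_count` remaps most keys, which is why schemes such as consistent hashing are often preferred when shards are added or removed regularly.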
Stateless Services
Services that hold no in-memory state can be scaled horizontally by adding instances.
Session state in memory → breaks horizontal scaling (requests must hit same instance)
Session state in Redis → any instance can serve any user
Caching at the Service Level
Each service owns its own cache. Never share a cache between services.
UserService: cache user profiles in Redis (TTL: 5 min)
OrderService: cache order summaries in Redis (TTL: 1 min)
→ Independent invalidation
→ No cross-service cache dependency
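The TTL behaviour can be sketched with a tiny in-process cache. This is a stand-in for a per-service Redis keyspace, not a Redis client; the `TTLCache` class and its `loader` callback are assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny in-process TTL cache owned by a single service."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, loader):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]  # still fresh
        value = loader(key)  # miss or expired: reload from the owning service's store
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        return value

    def invalidate(self, key) -> None:
        self._entries.pop(key, None)


user_profile_cache = TTLCache(ttl_seconds=300)   # 5 min, as in the example above
order_summary_cache = TTLCache(ttl_seconds=60)   # 1 min
```

Each service constructs and invalidates its own instance, so a change in one service's cache policy never touches another's.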
---
Key Takeaways
| Principle | Rule |
|---|---|
| Independent deployability | The only non-negotiable property of a microservice |
| Domain boundaries | Model on business capability, not technical layers |
| Own your data | No shared databases — ever |
| Strangler fig | Migrate incrementally; never big-bang rewrite |
| Prefer async | Event-driven communication reduces temporal coupling |
| Sagas over transactions | Compensating actions replace distributed ACID |
| Contract tests | Replace fragile end-to-end tests for integration verification |
| Observe everything | Logs + metrics + tracing are essential in every service |
| Zero trust | Authenticate every call; mTLS at the transport layer |
| Fail gracefully | Timeouts, retries, circuit breakers, bulkheads on every outbound call |