Faysal Ahmed
Chapter 6

Cloud and Infrastructure Architecture

Cloud service models, hybrid strategies, cost governance, and infrastructure-as-code.

Cloud Service Models

Model  You Manage               Provider Manages          Example
SaaS   Config only              Everything else           Salesforce, GitHub
PaaS   Apps & data              Runtime, OS, hardware     Heroku, App Engine
FaaS   Functions                Scaling, runtime, infra   Lambda, Cloud Functions
IaaS   Apps, data, runtime, OS  Virtualisation, hardware  EC2, GCE
Table 6.1 — Cloud service models ranked by level of abstraction (highest to lowest).

Guideline

Prefer higher-level services (PaaS > IaaS) when they fit your constraints. They shift operational burden to the provider. Use IaaS only when you need fine-grained control or the higher-level service doesn't meet compliance or performance requirements.

Deployment Strategies

Strategy        Availability  Latency              Cost    Complexity
Single region   Medium        Low (local)          Low     Low
Active-Passive  High          Low (failover ~min)  Medium  Medium
Active-Active   Very high     Low (global)         High    High
Table 6.2 — Deployment strategy comparison across key dimensions.
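
To make the Availability column in Table 6.2 more concrete, the sketch below estimates the combined availability of redundant regions. It assumes regions fail independently and that traffic shifts instantly, which real deployments only approximate (an active-passive setup also loses the minutes spent failing over). The 99.9% per-region figure and both function names are illustrative assumptions, not numbers or APIs from any provider.

```python
def composite_availability(region_availability: float, regions: int) -> float:
    """Availability of a system that is up while at least one region is up.

    Assumes independent region failures, which is an idealisation.
    """
    if not 0.0 <= region_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if regions < 1:
        raise ValueError("need at least one region")
    # The system is down only when every region is down at the same time.
    return 1.0 - (1.0 - region_availability) ** regions


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


single = composite_availability(0.999, 1)  # hypothetical 99.9% region
dual = composite_availability(0.999, 2)
print(f"single region: {downtime_minutes_per_year(single):.1f} min/year")
print(f"two regions:   {downtime_minutes_per_year(dual):.1f} min/year")
```

Under these assumptions, one region at 99.9% implies roughly 526 minutes of downtime a year, while two such regions imply under a minute. Correlated failures and failover time narrow that gap in practice, which is part of why the cost and complexity columns matter.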

Infrastructure as Code (IaC)

Treat infrastructure the same way you treat application code: keep it in version control, review it, and apply it through tooling. Common tools include:

  • Terraform / OpenTofu — declarative, stateful, multi-cloud
  • Pulumi — infrastructure in general-purpose languages (TypeScript, Python, Go)
  • CloudFormation / CDK — AWS-native
  • Ansible — configuration management, procedural
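
The "declarative, stateful" model in the list above can be illustrated with a toy example: you describe the desired resources, the tool compares them with its recorded state, and it works out what to create, update, or delete. This is a minimal sketch of the idea only, not how Terraform or any other tool is implemented. The `plan` function, the resource names, and the attributes are all hypothetical.

```python
from __future__ import annotations


def plan(desired: dict[str, dict], actual: dict[str, dict]) -> dict[str, list[str]]:
    """Compare desired resources with recorded state and list the changes needed."""
    to_create = sorted(name for name in desired if name not in actual)
    to_delete = sorted(name for name in actual if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return {"create": to_create, "update": to_update, "delete": to_delete}


# Hypothetical resource names and attributes, for illustration only.
desired = {
    "web-server": {"size": "medium"},
    "database": {"size": "large"},
}
recorded_state = {
    "web-server": {"size": "small"},
    "old-cache": {"size": "small"},
}

print(plan(desired, recorded_state))
# {'create': ['database'], 'update': ['web-server'], 'delete': ['old-cache']}
```

A procedural tool, by contrast, runs the steps you list in order. The declarative approach trades that explicit control for the ability to derive the steps from a diff like the one above.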

IaC Best Practices

Practice                                    Why
Store state remotely (S3, Terraform Cloud)  Prevents state loss and enables team collaboration
Review IaC changes in pull requests         Catches misconfigurations before deployment
Use modules                                 Avoids duplication and enforces standards
Tag all resources                           Enables cost tracking by team, project, environment
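
The tagging practice is easier to enforce when a check runs before deployment, for example as part of pull-request review. The sketch below assumes resources are available as plain dictionaries of tags; the `missing_tags` function and the sample resource names are hypothetical, and a real pipeline would read the tags from the output of whichever IaC tool is in use.

```python
from __future__ import annotations

# Required tag keys, taken from the cost-tracking dimensions named above.
REQUIRED_TAGS = ("team", "project", "environment")


def missing_tags(resources: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Return, per resource, the required tags that are absent or empty."""
    problems: dict[str, list[str]] = {}
    for name, tags in resources.items():
        missing = [tag for tag in REQUIRED_TAGS if not tags.get(tag)]
        if missing:
            problems[name] = missing
    return problems


# Hypothetical resources, for illustration only.
resources = {
    "api-server": {"team": "payments", "project": "checkout", "environment": "prod"},
    "scratch-bucket": {"team": "data"},
}

print(missing_tags(resources))
# {'scratch-bucket': ['project', 'environment']}
```

A check like this could fail the build when the returned dictionary is non-empty, so untagged resources never reach the cloud account.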

Cost Governance

Cloud costs spiral without governance. Establish the following practices:

Practice                  Impact                        Effort
Budgets and alerts        Prevents bill shock           Low
Resource tagging          Enables cost attribution      Low
Right-sizing instances    Reduces waste 20-40%          Medium
Auto-scaling              Matches capacity to demand    Medium
Reserved / savings plans  30-60% discount vs on-demand  Low
Table 6.3 — Cost governance practices ranked by impact.
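
As a rough illustration of the "Budgets and alerts" row, the sketch below checks which alert thresholds a month-to-date spend has crossed and projects month-end spend linearly. The thresholds (50%, 80%, 100%), the function names, and the sample figures are assumptions chosen for the example. Cloud providers offer managed budget alerts that would normally replace hand-written checks like this.

```python
from __future__ import annotations


def crossed_thresholds(
    spend: float, budget: float, thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)
) -> list[float]:
    """Return the budget fractions that current spend has reached or exceeded."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [t for t in thresholds if ratio >= t]


def forecast_month_end(spend_to_date: float, day_of_month: int, days_in_month: int) -> float:
    """Naive linear projection: assumes spend continues at the same daily rate."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be within the month")
    return spend_to_date / day_of_month * days_in_month


# Hypothetical figures: 6,000 spent by day 15 of a 30-day month, budget 10,000.
budget = 10_000.0
spend = 6_000.0
print(crossed_thresholds(spend, budget))          # [0.5]
projected = forecast_month_end(spend, 15, 30)
print(projected, projected > budget)              # 12000.0 True
```

In this example only the 50% alert has fired so far, but the projection already exceeds the budget. That is the kind of early signal that prevents bill shock.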

Designing for Resilience

              ┌──────────┐
              │   Load   │
              │ Balancer │
              └────┬─────┘
                   │
        ┌──────────┼──────────┐
        │          │          │
   ┌────▼───┐ ┌────▼───┐ ┌────▼───┐
   │  App   │ │  App   │ │  App   │
   │ Inst.1 │ │ Inst.2 │ │ Inst.3 │
   └────┬───┘ └────────┘ └────────┘
        │
   ┌────▼────┐
   │ Circuit │
   │ Breaker │──► Downstream Service
   └─────────┘
Figure 6.1 — Basic resilient architecture with load balancer, multiple instances, and circuit breaker.
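
The circuit breaker box in Figure 6.1 can be sketched in a few lines. This is a minimal, single-threaded illustration under stated assumptions, not a production implementation: the class name, the default threshold of 3 failures, and the 30-second reset timeout are all hypothetical choices, and a real service would typically use an established library and add locking.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so the breaker can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a broken dependency.
                raise CircuitOpenError("downstream marked unhealthy")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

An app instance would wrap each downstream request, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile` function, and treat `CircuitOpenError` as a signal to return a fallback response.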

The goal is not to prevent all failures — it’s to limit blast radius and recover automatically:

  • Load balancers — distribute traffic, health checks
  • Auto-scaling groups — replace failed instances
  • Circuit breakers — fail fast when downstream is down
  • Retries with backoff — handle transient failures
  • Bulkheads — isolate failure to one component
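
The "Retries with backoff" item above can be sketched as follows. This is one common variant (exponential backoff with full jitter), shown under assumptions: the function name, the default delays, and the choice of which exceptions count as transient are all illustrative and would need tuning for a real dependency.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func, retrying transient errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

Errors outside `retry_on` propagate immediately, since retrying a non-transient failure (such as a validation error) only adds load. Retries are also safest for idempotent operations, where repeating the call cannot apply the same change twice.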

Remember

Resilience is not just about infrastructure. An architecture where a single database failure takes down the entire system is not resilient — regardless of how many app instances you have running.

