Cloud Architecture and AWS Best Practices: Building Scalable Infrastructure

Hello! I’m Ahmet Zeybek, a full stack developer with extensive experience in cloud architecture and AWS infrastructure. Moving to the cloud has changed how we build and scale applications: capacity can be provisioned on demand instead of being planned months ahead. In this guide, I’ll share the patterns and practices that have helped me design cost-effective, scalable, and reliable cloud architectures.

Cloud Architecture Fundamentals

1. Well-Architected Framework

The AWS Well-Architected Framework defines six pillars. This guide covers five of them; the sixth, Sustainability, is not covered here:

Operational Excellence

Automate everything: Infrastructure as Code (IaC)
Monitor and log: Comprehensive observability
Incident response: Runbooks and automation

Security

Defense in depth: Multiple security layers
Least privilege: Minimal required permissions
Encryption everywhere: Data at rest and in transit

Reliability

Fault tolerance: Design for failure
Auto scaling: Handle traffic spikes
Disaster recovery: Multi-region backup

Performance Efficiency

Right-sized resources: Don’t over-provision
Caching strategies: Reduce database load
CDN usage: Global content delivery

Cost Optimization

Demand-based scaling: Pay only for what you use
Resource optimization: Right-size instances
Storage tiering: Use appropriate storage classes

Infrastructure as Code

2. AWS CDK for Infrastructure

Modern infrastructure provisioning:

import * as cdk from 'aws-cdk-lib'
import { Construct } from 'constructs'
import * as ec2 from 'aws-cdk-lib/aws-ec2'
import * as rds from 'aws-cdk-lib/aws-rds'
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as apigateway from 'aws-cdk-lib/aws-apigateway'
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront'
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins'
import * as s3 from 'aws-cdk-lib/aws-s3'

export class MyAppStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props)

    // VPC with proper networking
    const vpc = new ec2.Vpc(this, 'MyAppVPC', {
      maxAzs: 3,
      natGateways: 1,
      subnetConfiguration: [
        {
          cidrMask: 24,
          name: 'Public',
          subnetType: ec2.SubnetType.PUBLIC,
        },
        {
          cidrMask: 24,
          name: 'Private',
          // PRIVATE_WITH_NAT is deprecated in favor of PRIVATE_WITH_EGRESS
          subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        },
        {
          cidrMask: 24,
          name: 'Database',
          subnetType: ec2.SubnetType.PRIVATE_ISOLATED,
        },
      ],
    })

    // RDS PostgreSQL database
    const database = new rds.DatabaseInstance(this, 'MyAppDB', {
      engine: rds.DatabaseInstanceEngine.postgres({ version: rds.PostgresEngineVersion.VER_15 }),
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MICRO),
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
      databaseName: 'myapp',
      allocatedStorage: 20,
      maxAllocatedStorage: 100,
      storageEncrypted: true,
      backupRetention: cdk.Duration.days(7),
      deletionProtection: true,
      monitoringInterval: cdk.Duration.seconds(60),
      enablePerformanceInsights: true,
    })

    // Lambda functions
    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      code: lambda.Code.fromAsset('../lambda/dist'),
      handler: 'index.handler',
      timeout: cdk.Duration.seconds(30),
      memorySize: 512,
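      environment: {
        // Pass the secret's ARN and read the credentials at runtime. Unwrapping the
        // secret here would write it in plain text into the CloudFormation template,
        // and the generated secret has no 'connectionString' field anyway.
        DATABASE_SECRET_ARN: database.secret!.secretArn,
        NODE_ENV: 'production',
      },
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      securityGroups: [createLambdaSecurityGroup(this, vpc)],
    })

    // Let the function read the generated credentials and reach the database port
    database.secret?.grantRead(apiHandler)
    database.connections.allowDefaultPortFrom(apiHandler)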

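    // API Gateway. A RestApi with no methods fails validation at synth time,
    // so the routes served by the handler are declared explicitly.
    const api = new apigateway.LambdaRestApi(this, 'MyAppAPI', {
      handler: apiHandler,
      proxy: false,
      restApiName: 'MyApp API',
      description: 'API for MyApp',
      deployOptions: {
        stageName: 'prod',
        dataTraceEnabled: true,
        loggingLevel: apigateway.MethodLoggingLevel.INFO,
        metricsEnabled: true,
      },
    })

    const users = api.root.addResource('users')
    users.addMethod('GET')
    users.addMethod('POST')
    const user = users.addResource('{id}')
    user.addMethod('GET')
    user.addMethod('PUT')
    user.addMethod('DELETE')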

    // CloudFront distribution
    const distribution = new cloudfront.Distribution(this, 'MyAppCDN', {
      defaultBehavior: {
        origin: new origins.RestApiOrigin(api),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        compress: true,
        // CACHING_OPTIMIZED would cache API responses and serve them to other callers
        cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
      },
      comment: 'CDN for MyApp',
      enabled: true,
      httpVersion: cloudfront.HttpVersion.HTTP2_AND_3,
      priceClass: cloudfront.PriceClass.PRICE_CLASS_ALL,
    })

    // S3 bucket for static assets
    const assetsBucket = new s3.Bucket(this, 'AssetsBucket', {
      bucketName: `myapp-assets-${this.account}-${this.region}`,
      publicReadAccess: false,
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      encryption: s3.BucketEncryption.S3_MANAGED,
      versioned: true,
      lifecycleRules: [
        {
          id: 'Transition to IA',
          enabled: true,
          transitions: [
            {
              storageClass: s3.StorageClass.INFREQUENTLY_ACCESSED,
              transitionAfter: cdk.Duration.days(30),
            },
            {
              storageClass: s3.StorageClass.GLACIER,
              transitionAfter: cdk.Duration.days(90),
            },
          ],
        },
      ],
    })

    // Outputs
    new cdk.CfnOutput(this, 'CDNURL', {
      value: `https://${distribution.distributionDomainName}`,
      description: 'CloudFront distribution URL',
    })

    new cdk.CfnOutput(this, 'DatabaseEndpoint', {
      value: database.instanceEndpoint.hostname,
      description: 'Database endpoint',
    })
  }
}

function createLambdaSecurityGroup(scope: Construct, vpc: ec2.IVpc): ec2.SecurityGroup {
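  // allowAllOutbound defaults to true, in which case the egress rules below are ignored
  const sg = new ec2.SecurityGroup(scope, 'LambdaSG', { vpc, allowAllOutbound: false })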

  // Allow outbound to database
  sg.addEgressRule(ec2.Peer.ipv4(vpc.vpcCidrBlock), ec2.Port.tcp(5432), 'Allow database connections')

  // Allow outbound to internet (for external APIs)
  sg.addEgressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(443), 'Allow HTTPS outbound')

  return sg
}
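
The VPC above asks for /24 subnets in three groups (Public, Private, Database) across up to three Availability Zones. A minimal sketch of the address arithmetic behind that choice, assuming the CDK default VPC range of 10.0.0.0/16 (the stack does not set one) and the five addresses AWS reserves in every subnet. The helper names are hypothetical and not part of aws-cdk-lib.

```typescript
// Hypothetical helpers for illustration; they are not part of aws-cdk-lib.
const AWS_RESERVED_PER_SUBNET = 5 // addresses AWS keeps back in every subnet

// Usable host addresses in a subnet with the given prefix length.
function usableAddresses(cidrMask: number): number {
  return 2 ** (32 - cidrMask) - AWS_RESERVED_PER_SUBNET
}

// One subnet is created per subnet group per Availability Zone.
function subnetCount(subnetGroups: number, azs: number): number {
  return subnetGroups * azs
}

// A /vpcMask block holds 2^(subnetMask - vpcMask) subnets of size /subnetMask.
function fitsInVpc(vpcMask: number, subnetMask: number, count: number): boolean {
  return count <= 2 ** (subnetMask - vpcMask)
}

const plannedSubnets = subnetCount(3, 3) // Public, Private, Database across 3 AZs
console.log(plannedSubnets, usableAddresses(24), fitsInVpc(16, 24, plannedSubnets))
```

Nine /24 subnets use only a small part of a /16, which leaves room to add subnet groups later without re-addressing the VPC.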

Serverless Architecture

3. Event-Driven Serverless Design

Build applications that respond to events:

// AWS Lambda handlers
import type { APIGatewayEvent, S3Event } from 'aws-lambda' // event types from @types/aws-lambda
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { S3Client, HeadObjectCommand } from '@aws-sdk/client-s3'
import { SNSClient, PublishCommand } from '@aws-sdk/client-sns'

const dynamoClient = new DynamoDBClient({})
const s3Client = new S3Client({})
const snsClient = new SNSClient({})

// Process file upload event
export const processFileUpload = async (event: S3Event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name
    // Keys in S3 events are URL-encoded, with spaces sent as '+'
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))

    try {
      // Get file metadata without downloading the object body
      const fileMetadata = await s3Client.send(new HeadObjectCommand({ Bucket: bucket, Key: key }))

      // Process file based on type
      if (key.endsWith('.csv')) {
        await processCSVFile(bucket, key)
      } else if (key.endsWith('.json')) {
        await processJSONFile(bucket, key)
      }

      // Update processing status
      await updateProcessingStatus(key, 'completed')

      // Send notification
      await snsClient.send(
        new PublishCommand({
          TopicArn: process.env.NOTIFICATION_TOPIC_ARN,
          Message: JSON.stringify({
            type: 'FILE_PROCESSED',
            fileKey: key,
            status: 'success',
          }),
        })
      )
    } catch (error) {
      console.error('File processing error:', error)

      // Update status to failed
      await updateProcessingStatus(key, 'failed')

      // Send error notification
      await snsClient.send(
        new PublishCommand({
          TopicArn: process.env.NOTIFICATION_TOPIC_ARN,
          Message: JSON.stringify({
            type: 'FILE_PROCESSING_ERROR',
            fileKey: key,
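            // A caught value is `unknown` in strict TypeScript, so narrow it first
            error: error instanceof Error ? error.message : String(error),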
          }),
        })
      )
    }
  }
}

// API Gateway handler
export const apiHandler = async (event: APIGatewayEvent) => {
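  // `resource` is the route template (for example /users/{id});
  // `path` holds the concrete URL and would never match the templated cases
  const { httpMethod, resource, body } = event

  try {
    // `body` is null for requests without a payload
    const payload = body ? JSON.parse(body) : {}

    switch (`${httpMethod} ${resource}`) {
      case 'GET /users':
        return await getUsers()

      case 'POST /users':
        return await createUser(payload)

      case 'GET /users/{id}':
        return await getUser(event.pathParameters?.id)

      case 'PUT /users/{id}':
        return await updateUser(event.pathParameters?.id, payload)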

      case 'DELETE /users/{id}':
        return await deleteUser(event.pathParameters?.id)

      default:
        return {
          statusCode: 404,
          body: JSON.stringify({ error: 'Not found' }),
        }
    }
  } catch (error) {
    console.error('API error:', error)

    return {
      statusCode: 500,
      body: JSON.stringify({
        error: 'Internal server error',
        message:
          process.env.NODE_ENV === 'development' && error instanceof Error ? error.message : 'Something went wrong',
      }),
    }
  }
}

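// Order workflow written as a single handler. With Step Functions, each
// numbered step below would become its own state with retries and catch rules.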
export const orderProcessingWorkflow = async (event: StepFunctionEvent) => {
  const { orderId } = event

  try {
    // 1. Validate order
    await validateOrder(orderId)

    // 2. Check inventory
    const inventoryAvailable = await checkInventory(orderId)

    if (!inventoryAvailable) {
      await updateOrderStatus(orderId, 'CANCELLED')
      return { status: 'cancelled', reason: 'insufficient_inventory' }
    }

    // 3. Process payment
    const paymentResult = await processPayment(orderId)

    if (!paymentResult.success) {
      await updateOrderStatus(orderId, 'PAYMENT_FAILED')
      return { status: 'failed', reason: 'payment_failed' }
    }

    // 4. Reserve inventory
    await reserveInventory(orderId)

    // 5. Update order status
    await updateOrderStatus(orderId, 'CONFIRMED')

    // 6. Send confirmation email
    await sendOrderConfirmation(orderId)

    return { status: 'completed', orderId }
  } catch (error) {
    console.error('Order processing error:', error)

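    // Compensating actions. releaseInventory must be a safe no-op when nothing
    // was reserved, and a captured payment would also need to be refunded here.
    await updateOrderStatus(orderId, 'FAILED')
    await releaseInventory(orderId)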

    throw error
  }
}
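
The catch block above always runs the same compensations, whichever step failed. A minimal sketch of a stricter approach: record which steps completed and undo only those, newest first. The step names come from the handler above, while `refundPayment` and `compensationPlan` are hypothetical names introduced for illustration.

```typescript
// Maps each forward step to the action that undoes it.
// `refundPayment` is a hypothetical helper; it does not appear in the handler above.
const COMPENSATIONS = new Map<string, string>([
  ['processPayment', 'refundPayment'],
  ['reserveInventory', 'releaseInventory'],
])

// Given the steps that completed before a failure, return the undo actions
// to run, newest first. Steps without a compensation are skipped.
function compensationPlan(completedSteps: string[]): string[] {
  const plan: string[] = []
  for (const step of [...completedSteps].reverse()) {
    const undo = COMPENSATIONS.get(step)
    if (undo !== undefined) plan.push(undo)
  }
  return plan
}

// Failure after inventory was reserved: both side effects are undone.
console.log(compensationPlan(['validateOrder', 'checkInventory', 'processPayment', 'reserveInventory']))
// Failure during payment: nothing was charged or reserved, so nothing is undone.
console.log(compensationPlan(['validateOrder', 'checkInventory']))
```

Step Functions expresses the same idea declaratively, with a catch rule per state that routes to the matching compensation state.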

Database Architecture

4. Multi-Region Database Design

Ensure high availability and disaster recovery:

// Aurora PostgreSQL cluster with one writer and one in-region reader.
// A true Aurora Global Database also needs an rds.CfnGlobalCluster plus a
// secondary cluster in another region; that part is not shown here.
const globalDatabase = new rds.DatabaseCluster(this, 'GlobalDB', {
  engine: rds.DatabaseClusterEngine.auroraPostgres({
    // Aurora versions are pinned to a minor release
    version: rds.AuroraPostgresEngineVersion.VER_15_4,
  }),
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
  writer: rds.ClusterInstance.provisioned('Writer', {
    instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.LARGE),
    enablePerformanceInsights: true,
  }),
  readers: [
    // Same-region read replica; promotionTier sets its failover priority
    rds.ClusterInstance.provisioned('ReadReplica', {
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.LARGE),
      enablePerformanceInsights: true,
      promotionTier: 2,
    }),
  ],
  storageEncrypted: true,
  backup: {
    retention: cdk.Duration.days(7),
    preferredWindow: '03:00-04:00',
  },
  monitoringInterval: cdk.Duration.seconds(60),
})

// ElastiCache for Redis
const redisCluster = new elasticache.CfnServerlessCache(this, 'RedisCluster', {
  engine: 'redis',
  serverlessCacheName: 'myapp-redis',
  description: 'Redis cluster for session storage and caching',
  securityGroupIds: [redisSecurityGroup.securityGroupId],
  subnetIds: vpc.privateSubnets.map((subnet) => subnet.subnetId),
  cacheUsageLimits: {
    dataStorage: {
      maximum: 10,
      unit: 'GB',
    },
    ecpuPerSecond: {
      maximum: 10000,
    },
  },
  dailySnapshotTime: '05:00',
  majorEngineVersion: '7',
})
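
The Redis cache above is typically used with the cache-aside pattern: read from the cache first, fall back to the database on a miss, and invalidate the entry after writes. A minimal sketch of that flow, with an in-memory `Map` standing in for Redis and an injectable clock so expiry is easy to test. The class name `CacheAside` and the `loadFromDb` callback are hypothetical.

```typescript
interface CacheEntry<T> {
  value: T
  expiresAt: number
}

// In-memory stand-in for Redis, for illustration only.
class CacheAside<T> {
  private store = new Map<string, CacheEntry<T>>()
  private ttlMs: number
  private now: () => number
  loads = 0 // how many times the database loader was called

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Return the cached value if it is still fresh, otherwise load and cache it.
  get(key: string, loadFromDb: (key: string) => T): T {
    const entry = this.store.get(key)
    if (entry !== undefined && entry.expiresAt > this.now()) {
      return entry.value // cache hit
    }
    this.loads++
    const value = loadFromDb(key) // miss or expired entry
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs })
    return value
  }

  // Call after a write so later reads do not see stale data.
  invalidate(key: string): void {
    this.store.delete(key)
  }
}

const userCache = new CacheAside<string>(60_000)
console.log(userCache.get('user:1', (key) => `row for ${key}`))
console.log(userCache.get('user:1', (key) => `row for ${key}`), userCache.loads)
```

The second read is served from the cache, so the loader runs once. With a real Redis client the same three operations map to a read, a write with an expiry, and a delete.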

Security Architecture

5. Zero-Trust Security Model

Implement comprehensive security:

// IAM policies with least privilege
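const lambdaExecutionRole = new iam.Role(this, 'LambdaExecutionRole', {
  assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com'),
  // VPC-attached functions need the VPC access policy, which also covers basic logging
  managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaVPCAccessExecutionRole')],
  inlinePolicies: {
    DatabaseAccess: new iam.PolicyDocument({
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          // Read only the database credentials secret. rds-db:connect applies to
          // IAM database users and does not accept a secret ARN as its resource.
          actions: ['secretsmanager:GetSecretValue'],
          resources: [database.secret!.secretArn],
        }),
      ],
    }),
    // Each inline policy must be a PolicyDocument, not a bare statement
    S3Access: new iam.PolicyDocument({
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: ['s3:GetObject', 's3:PutObject'],
          resources: [`${assetsBucket.bucketArn}/*`],
        }),
      ],
    }),
  },
})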

// VPC endpoints for secure access
const dynamodbEndpoint = new ec2.GatewayVpcEndpoint(this, 'DynamoDBEndpoint', {
  service: ec2.GatewayVpcEndpointAwsService.DYNAMODB,
  vpc,
  subnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
})

const s3Endpoint = new ec2.GatewayVpcEndpoint(this, 'S3Endpoint', {
  service: ec2.GatewayVpcEndpointAwsService.S3,
  vpc,
  subnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
})

// Security groups with specific rules
const databaseSecurityGroup = new ec2.SecurityGroup(this, 'DatabaseSG', {
  vpc,
  description: 'Security group for database',
  allowAllOutbound: false,
})

// Only allow connections from application security group
databaseSecurityGroup.addIngressRule(applicationSecurityGroup, ec2.Port.tcp(5432), 'Allow PostgreSQL connections from application')

// WAF for API Gateway
const webACL = new wafv2.CfnWebACL(this, 'MyAppWebACL', {
  name: 'MyAppWebACL',
  scope: 'REGIONAL',
  // Allow by default. Every rule below blocks, so a default of block would reject all traffic.
  defaultAction: { allow: {} },
  rules: [
    {
      name: 'RateLimit',
      priority: 1,
      action: { block: {} },
      statement: {
        rateBasedStatement: {
          limit: 1000,
          aggregateKeyType: 'IP',
        },
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'RateLimitRule',
      },
    },
    {
      name: 'SQLInjection',
      priority: 2,
      action: { block: {} },
      statement: {
        sqliMatchStatement: {
          fieldToMatch: { body: {} },
          // Decode first so encoded payloads are normalized before matching
          textTransformations: [
            { priority: 0, type: 'URL_DECODE' },
            { priority: 1, type: 'LOWERCASE' },
          ],
        },
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'SQLInjectionRule',
      },
    },
  ],
  visibilityConfig: {
    sampledRequestsEnabled: true,
    cloudWatchMetricsEnabled: true,
    metricName: 'MyAppWebACL',
  },
})
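
The `RateLimit` rule above blocks an IP once it exceeds `limit` requests within a trailing time window (five minutes by default, to my understanding of AWS WAF). A minimal sketch of that idea as a sliding-window counter. This illustrates the technique and is not how AWS WAF is implemented; `SlidingWindowLimiter` is a hypothetical name.

```typescript
class SlidingWindowLimiter {
  private seen = new Map<string, number[]>() // request timestamps per IP
  private limit: number
  private windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Records the request and returns true if this IP is still within the limit.
  allow(ip: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs
    // Keep only timestamps that fall inside the trailing window
    const recent = (this.seen.get(ip) ?? []).filter((t) => t > cutoff)
    recent.push(nowMs)
    this.seen.set(ip, recent)
    return recent.length <= this.limit
  }
}

// Same numbers as the rule above: 1000 requests per 5 minutes per IP
const wafLikeLimiter = new SlidingWindowLimiter(1000, 5 * 60 * 1000)
console.log(wafLikeLimiter.allow('203.0.113.10', Date.now()))
```

Blocked requests still count toward the window in this sketch, so a client that keeps hammering the endpoint stays blocked until its request rate drops.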

Monitoring and Observability

6. Comprehensive Monitoring Setup

Monitor your entire infrastructure:

// CloudWatch dashboards
const dashboard = new cloudwatch.Dashboard(this, 'MyAppDashboard', {
  dashboardName: 'MyApp-Monitoring-Dashboard',
  defaultInterval: cdk.Duration.hours(24),
})

// Add widgets to dashboard
dashboard.addWidgets(
  new cloudwatch.GraphWidget({
    title: 'API Gateway Latency',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/ApiGateway',
        metricName: 'Latency',
        dimensionsMap: { ApiName: api.restApiName },
      }),
    ],
  }),

  new cloudwatch.GraphWidget({
    title: 'Lambda Errors',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/Lambda',
        metricName: 'Errors',
        dimensionsMap: { FunctionName: apiHandler.functionName },
      }),
    ],
  }),

  new cloudwatch.GraphWidget({
    title: 'Database Connections',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/RDS',
        metricName: 'DatabaseConnections',
        dimensionsMap: { DBInstanceIdentifier: database.instanceIdentifier },
      }),
    ],
  })
)

// CloudWatch alarms
const highLatencyAlarm = new cloudwatch.Alarm(this, 'HighLatencyAlarm', {
  alarmName: 'MyApp-HighLatency',
  alarmDescription: 'API Gateway latency is too high',
  metric: new cloudwatch.Metric({
    namespace: 'AWS/ApiGateway',
    metricName: 'Latency',
    dimensionsMap: { ApiName: api.restApiName },
  }),
  threshold: 1000, // 1 second
  evaluationPeriods: 2,
  comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
  treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
})

// SNS topic for notifications
const alarmTopic = new sns.Topic(this, 'AlarmTopic', {
  topicName: 'MyApp-Alarms',
  displayName: 'MyApp Alarm Notifications',
})

// Subscribe email to alarms
alarmTopic.addSubscription(new subscriptions.EmailSubscription('alerts@myapp.com'))

// Connect alarm to topic
highLatencyAlarm.addAlarmAction(new actions.SnsAction(alarmTopic))
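
The alarm above combines a threshold, `evaluationPeriods: 2`, and `NOT_BREACHING` for missing data. A simplified sketch of how those three settings interact. Real CloudWatch alarms also have an INSUFFICIENT_DATA state and an optional "M out of N" setting, both of which are left out here; `evaluateAlarm` is a hypothetical helper.

```typescript
type AlarmState = 'OK' | 'ALARM'

// datapoints: one value per period, oldest first; undefined means missing data.
function evaluateAlarm(
  datapoints: Array<number | undefined>,
  threshold: number,
  evaluationPeriods: number,
): AlarmState {
  const recent = datapoints.slice(-evaluationPeriods)
  if (recent.length < evaluationPeriods) return 'OK'
  // NOT_BREACHING: a missing datapoint is treated as within the threshold.
  // GREATER_THAN_THRESHOLD: a value equal to the threshold does not breach.
  const allBreaching = recent.every((value) => value !== undefined && value > threshold)
  return allBreaching ? 'ALARM' : 'OK'
}

// Latency in milliseconds for the last three periods
console.log(evaluateAlarm([500, 1200, 1300], 1000, 2))
console.log(evaluateAlarm([1200, undefined], 1000, 2))
```

Requiring two consecutive breaching periods keeps a single slow request burst from paging anyone.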

Cost Optimization

7. Cost Optimization Strategies

Reduce cloud costs while maintaining performance:

// Auto scaling configuration
const autoScalingGroup = new autoscaling.AutoScalingGroup(this, 'WebServerASG', {
  vpc,
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MICRO),
  machineImage: ec2.MachineImage.latestAmazonLinux2(),
  minCapacity: 1,
  maxCapacity: 10,
  desiredCapacity: 2,
  cooldown: cdk.Duration.minutes(5),
})

// Scaling policies are attached with methods on the group, not passed as props.
// Scale based on CPU utilization (target tracking)
autoScalingGroup.scaleOnCpuUtilization('ScaleOut', {
  targetUtilizationPercent: 70,
})

// Scheduled scaling for predictable traffic
autoScalingGroup.scaleOnSchedule('ScaleUpForBusinessHours', {
  schedule: autoscaling.Schedule.cron({ hour: '9', minute: '0' }),
  timeZone: 'America/New_York',
  minCapacity: 3,
  maxCapacity: 8,
  desiredCapacity: 5,
})

autoScalingGroup.scaleOnSchedule('ScaleDownAfterBusinessHours', {
  schedule: autoscaling.Schedule.cron({ hour: '18', minute: '0' }),
  timeZone: 'America/New_York',
  minCapacity: 1,
  maxCapacity: 3,
  desiredCapacity: 2,
})

// Spot instances for cost savings
const spotFleet = new ec2.CfnSpotFleet(this, 'SpotFleet', {
  spotFleetRequestConfigData: {
    iamFleetRole: fleetRole.roleArn,
    allocationStrategy: 'diversified',
    targetCapacity: 10,
    spotPrice: '0.10', // Maximum spot price
    launchSpecifications: [
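      {
        instanceType: 'm5.large',
        imageId: 'ami-12345678', // placeholder AMI ID
        keyName: 'my-key-pair',
        // Launch specifications take security groups as { groupId } objects
        securityGroups: [{ groupId: webSecurityGroup.securityGroupId }],
        subnetId: vpc.publicSubnets[0].subnetId,
        weightedCapacity: 1,
      },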
    ],
  },
})

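// S3 Intelligent-Tiering
const intelligentTieringBucket = new s3.Bucket(this, 'IntelligentTieringBucket', {
  bucketName: `myapp-intelligent-${this.account}`,
  // Objects must be in the Intelligent-Tiering storage class for tiering to apply
  lifecycleRules: [
    {
      id: 'ToIntelligentTiering',
      enabled: true,
      transitions: [
        {
          storageClass: s3.StorageClass.INTELLIGENT_TIERING,
          transitionAfter: cdk.Duration.days(0),
        },
      ],
    },
  ],
  // S3 manages the frequent and infrequent access tiers automatically;
  // only the optional archive tiers are configurable.
  intelligentTieringConfigurations: [
    {
      name: 'EntireBucket',
      archiveAccessTierTime: cdk.Duration.days(365),
    },
  ],
})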

// Cost and usage report
const costReport = new s3.Bucket(this, 'CostReportBucket', {
  bucketName: `myapp-cost-reports-${this.account}`,
  lifecycleRules: [
    {
      id: 'DeleteOldReports',
      enabled: true,
      expiration: cdk.Duration.days(2555), // 7 years
    },
  ],
})

// Enable cost and usage report
new cur.CfnReportDefinition(this, 'CostAndUsageReport', {
  reportName: 'MyAppCostAndUsageReport',
  timeUnit: 'DAILY',
  format: 'Parquet',
  compression: 'Parquet',
  additionalSchemaElements: ['RESOURCES'],
  s3Bucket: costReport.bucketName,
  s3Prefix: 'cost-reports',
  s3Region: this.region,
  refreshClosedReports: true,
  reportVersioning: 'OVERWRITE_REPORT', // required property of CfnReportDefinition
})
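
To see what the scheduled scaling above is worth, compare instance-hours per day against running the business-hours capacity around the clock. A rough sketch, assuming the group stays at each scheduled `desiredCapacity` until the next action fires (CPU-based scaling is ignored). `instanceHoursPerDay` is a hypothetical helper.

```typescript
interface ScheduledCapacity {
  startHour: number // hour of day (0-23) at which the action fires
  desired: number // desiredCapacity set by the action
}

// Sum of capacity x hours over one day, wrapping past midnight.
function instanceHoursPerDay(schedule: ScheduledCapacity[]): number {
  const sorted = [...schedule].sort((a, b) => a.startHour - b.startHour)
  let totalHours = 0
  for (let i = 0; i < sorted.length; i++) {
    const current = sorted[i]
    const next = sorted[(i + 1) % sorted.length]
    // Hours until the next action; a single action covers all 24 hours
    const hours = (next.startHour - current.startHour + 24) % 24 || 24
    totalHours += hours * current.desired
  }
  return totalHours
}

const scheduled = instanceHoursPerDay([
  { startHour: 9, desired: 5 }, // ScaleUpForBusinessHours
  { startHour: 18, desired: 2 }, // ScaleDownAfterBusinessHours
])
const alwaysOn = instanceHoursPerDay([{ startHour: 0, desired: 5 }])
console.log(scheduled, alwaysOn, 1 - scheduled / alwaysOn)
```

Under these assumptions the schedule uses 75 instance-hours a day instead of 120, a 37.5% reduction before any Spot or Savings Plan discount.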

Multi-Region Architecture

8. Global Infrastructure Design

Build for global scale:

// Multi-region setup
export class GlobalStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: cdk.StackProps) {
    super(scope, id, props)

    // Primary region (us-east-1)
    const primaryRegion = new cdk.Stack(scope, 'PrimaryRegion', {
      env: { region: 'us-east-1' },
    })

    // Secondary region (eu-west-1)
    const secondaryRegion = new cdk.Stack(scope, 'SecondaryRegion', {
      env: { region: 'eu-west-1' },
    })

    // Global resources
    const globalTable = new dynamodb.Table(this, 'GlobalTable', {
      tableName: 'MyApp-GlobalTable',
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      sortKey: { name: 'sk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      // List replica regions only; the stack's own region (us-east-1 here) must not be included
      replicationRegions: ['eu-west-1', 'ap-southeast-1'],
      pointInTimeRecovery: true,
    })

    // Route 53 for global routing
    const hostedZone = new route53.HostedZone(this, 'MyAppZone', {
      zoneName: 'myapp.com',
    })

    // CloudFront with Lambda@Edge
    const distribution = new cloudfront.Distribution(this, 'GlobalCDN', {
      defaultBehavior: {
        origin: new origins.S3Origin(assetsBucket),
        edgeLambdas: [
          {
            functionVersion: edgeFunction.currentVersion,
            eventType: cloudfront.LambdaEdgeEventType.ORIGIN_REQUEST,
          },
        ],
      },
    })

    // Global health check
    const healthCheck = new route53.HealthCheck(this, 'GlobalHealthCheck', {
      fqdn: 'api.myapp.com',
      port: 443,
      type: route53.HealthCheckType.HTTPS,
      resourcePath: '/health',
      failureThreshold: 3,
      requestInterval: cdk.Duration.seconds(30),
    })
  }
}
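
The health check above marks the endpoint unhealthy after `failureThreshold: 3` consecutive failed checks, and routing then shifts to another region. A simplified sketch of that decision, assuming regions are listed in priority order. Real Route 53 aggregates results from many checkers and also requires consecutive passes before recovery, which is omitted here; the function names are hypothetical.

```typescript
interface RegionHealth {
  region: string
  recentChecks: boolean[] // oldest first; true means the check passed
}

// Unhealthy only when the last `failureThreshold` checks all failed.
function isHealthy(checks: boolean[], failureThreshold: number): boolean {
  const tail = checks.slice(-failureThreshold)
  return tail.length < failureThreshold || tail.some((passed) => passed)
}

// Pick the first healthy region in priority order, or undefined if none is healthy.
function pickRegion(regions: RegionHealth[], failureThreshold: number): string | undefined {
  return regions.find((r) => isHealthy(r.recentChecks, failureThreshold))?.region
}

const target = pickRegion(
  [
    { region: 'us-east-1', recentChecks: [true, false, false, false] },
    { region: 'eu-west-1', recentChecks: [true, true, true, true] },
  ],
  3,
)
console.log(target)
```

With a 30-second `requestInterval`, three consecutive failures mean roughly 90 seconds pass before failover begins, plus DNS TTL.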

DevOps and Automation

9. CI/CD Pipeline with AWS

Automated deployment pipeline:

name: Deploy to AWS

on:
  push:
    branches: [main]
  workflow_dispatch:

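env:
  AWS_REGION: us-east-1
  # NODE_ENV is not set to production here: npm ci would then skip the
  # devDependencies that the test and build steps need

# Needed so configure-aws-credentials can assume the role through OIDC
permissions:
  id-token: write
  contents: read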

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm run test

      - name: Build application
        run: npm run build

      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: build-artifacts
          path: dist/

  deploy-infrastructure:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'

    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Install CDK CLI
        run: npm install -g aws-cdk

      - name: Install CDK dependencies
        run: npm ci

      - name: Deploy to AWS
        run: |
          cdk bootstrap
          cdk deploy --require-approval never

  deploy-application:
    needs: deploy-infrastructure
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: build-artifacts
          path: dist/

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

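      - name: Deploy Lambda functions
        run: |
          # Package the build output and upload it to the deployment bucket
          (cd dist && zip -r ../api-handler.zip .)
          aws s3 cp api-handler.zip s3://myapp-deployment-bucket/lambda-functions/api-handler.zip

          # Update Lambda function code
          aws lambda update-function-code \
            --function-name MyAppApiHandler \
            --s3-bucket myapp-deployment-bucket \
            --s3-key lambda-functions/api-handler.zip

          # Point the prod stage at the new API Gateway deployment
          aws apigateway update-stage \
            --rest-api-id ${{ secrets.API_GATEWAY_ID }} \
            --stage-name prod \
            --patch-operations op=replace,path=/deploymentId,value=${{ secrets.DEPLOYMENT_ID }}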

  smoke-tests:
    needs: deploy-application
    runs-on: ubuntu-latest

    steps:
      - name: Run smoke tests
        run: |
          # Health check
          curl -f https://api.myapp.com/health

          # Basic API tests
          curl -f -X POST https://api.myapp.com/test \
            -H "Content-Type: application/json" \
            -d '{"test": "data"}'
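
The `needs:` keys in the workflow above define a dependency graph, and a job starts only after everything it needs has finished. A small sketch of how an execution order falls out of those declarations, using a depth-first topological sort. `jobOrder` is a hypothetical helper for illustration and is not how GitHub Actions is implemented.

```typescript
// Returns the jobs in an order where every job comes after the jobs it needs.
function jobOrder(needs: Record<string, string[]>): string[] {
  const order: string[] = []
  const state = new Map<string, 'visiting' | 'done'>()

  const visit = (job: string): void => {
    if (state.get(job) === 'done') return
    if (state.get(job) === 'visiting') throw new Error(`dependency cycle at ${job}`)
    state.set(job, 'visiting')
    for (const dependency of needs[job] ?? []) visit(dependency)
    state.set(job, 'done')
    order.push(job)
  }

  for (const job of Object.keys(needs)) visit(job)
  return order
}

// The `needs` relationships from the workflow above
const pipeline = jobOrder({
  'smoke-tests': ['deploy-application'],
  'deploy-application': ['deploy-infrastructure'],
  'deploy-infrastructure': ['test'],
  test: [],
})
console.log(pipeline.join(' -> '))
```

Because each job needs exactly one predecessor, the pipeline is a straight line: a failing test stops the infrastructure deployment and everything after it.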

Disaster Recovery

10. Backup and Recovery Strategy

Ensure business continuity:

// Backup vault with encryption (declared first so the plan can reference it)
const backupVault = new backup.BackupVault(this, 'MyAppBackupVault', {
  backupVaultName: 'MyApp-BackupVault',
  encryptionKey: kmsKey,
  accessPolicy: new iam.PolicyDocument({
    statements: [
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        principals: [new iam.AccountPrincipal(this.account)],
        actions: ['backup:*'],
        resources: ['*'],
      }),
    ],
  }),
})

// Automated backup strategy
const backupPlan = new backup.BackupPlan(this, 'MyAppBackupPlan', {
  backupPlanName: 'MyApp-BackupPlan',
  backupVault,
  backupPlanRules: [
    new backup.BackupPlanRule({
      ruleName: 'DailyBackups',
      scheduleExpression: events.Schedule.cron({ hour: '2', minute: '0' }),
      deleteAfter: cdk.Duration.days(30),
    }),
    new backup.BackupPlanRule({
      ruleName: 'WeeklyBackups',
      scheduleExpression: events.Schedule.cron({ weekDay: 'SUN', hour: '3', minute: '0' }),
      deleteAfter: cdk.Duration.days(90),
    }),
  ],
})

// A plan backs up nothing until resources are selected
backupPlan.addSelection('Databases', {
  resources: [backup.BackupResource.fromRdsDatabaseInstance(database)],
})

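// Cross-region replication for S3
// (replicationRules on s3.Bucket needs a recent aws-cdk-lib release)
const replicatedBucket = new s3.Bucket(this, 'ReplicatedBucket', {
  bucketName: `myapp-replicated-${this.account}`,
  versioned: true, // replication requires versioning on the source bucket
  replicationRules: [
    {
      id: 'CrossRegionReplication',
      // The destination bucket must already exist, with versioning enabled
      destination: s3.Bucket.fromBucketArn(
        this,
        'ReplicaDestination',
        `arn:aws:s3:::myapp-backup-${secondaryRegion}`
      ),
      storageClass: s3.StorageClass.INFREQUENT_ACCESS,
    },
  ],
})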

// Disaster recovery Lambda function
const disasterRecoveryFunction = new lambda.Function(this, 'DisasterRecovery', {
  runtime: lambda.Runtime.NODEJS_20_X,
  code: lambda.Code.fromAsset('../lambda/disaster-recovery'),
  handler: 'index.handler',
  timeout: cdk.Duration.minutes(15),
  environment: {
    PRIMARY_REGION: this.region,
    SECONDARY_REGION: secondaryRegion,
    BACKUP_BUCKET: replicatedBucket.bucketName,
  },
})
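
The backup rules above trade storage cost against recovery options. A back-of-the-envelope sketch of what each rule keeps at steady state and the worst-case data loss if backups are the only recovery source. Both helpers are hypothetical, and the figures ignore RDS point-in-time recovery, which would tighten the recovery point.

```typescript
// Fewest recovery points that exist at any moment once the rule has run
// for longer than its retention period.
function minRetainedRecoveryPoints(intervalDays: number, deleteAfterDays: number): number {
  return Math.floor(deleteAfterDays / intervalDays)
}

// Worst case: the failure happens just before the next backup would run.
function worstCaseDataLossHours(intervalDays: number): number {
  return intervalDays * 24
}

console.log('daily rule:', minRetainedRecoveryPoints(1, 30), 'points,', worstCaseDataLossHours(1), 'h')
console.log('weekly rule:', minRetainedRecoveryPoints(7, 90), 'points,', worstCaseDataLossHours(7), 'h')
```

The daily rule sets the recovery point objective at about 24 hours, while the weekly rule mainly extends how far back a restore can reach.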

Performance Optimization

11. Performance Monitoring and Optimization

Monitor and optimize performance:

// Lambda performance optimization
const optimizedFunction = new lambda.Function(this, 'OptimizedFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  code: lambda.Code.fromAsset('../lambda/optimized'),
  handler: 'index.handler',
  memorySize: 1024, // Increased memory for better CPU allocation
  timeout: cdk.Duration.seconds(30),
  reservedConcurrentExecutions: 50,
  environment: {
    NODE_OPTIONS: '--enable-source-maps --stack-trace-limit=1000',
  },

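  // Dead letter queue for failed asynchronous invocations
  deadLetterQueue: new sqs.Queue(this, 'FailedInvocationsDLQ', {
    queueName: 'MyApp-FailedInvocations',
    retentionPeriod: cdk.Duration.days(14),
  }),
})

// Provisioned concurrency for predictable performance.
// It is configured on a version or alias, not on the function itself.
const liveAlias = new lambda.Alias(this, 'OptimizedFunctionLive', {
  aliasName: 'live',
  version: optimizedFunction.currentVersion,
  provisionedConcurrentExecutions: 10,
})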

// ElastiCache for Redis with read replicas
const redisCluster = new elasticache.CfnReplicationGroup(this, 'RedisCluster', {
  replicationGroupId: 'myapp-redis',
  replicationGroupDescription: 'Redis cluster for caching',
  engine: 'redis',
  engineVersion: '7.0',
  cacheNodeType: 'cache.t3.micro',
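  automaticFailoverEnabled: true,
  multiAzEnabled: true,
  cacheSubnetGroupName: cacheSubnetGroup.ref,
  securityGroupIds: [redisSecurityGroup.securityGroupId],

  // Read replicas for read-heavy workloads: one shard with two replicas.
  // numCacheClusters is omitted because it conflicts with replicasPerNodeGroup,
  // which already gives three nodes in total.
  numNodeGroups: 1,
  replicasPerNodeGroup: 2,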
})
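
The `memorySize: 1024` comment above relies on Lambda allocating CPU in proportion to memory. Since compute is billed in GB-seconds, more memory costs the same when the function finishes proportionally faster. A small sketch of that arithmetic. The durations are made-up illustration values that assume a CPU-bound handler, and `gbSeconds` is a hypothetical helper.

```typescript
// Billed compute for a batch of invocations, in GB-seconds.
function gbSeconds(memoryMb: number, durationMs: number, invocations: number): number {
  return (memoryMb * durationMs * invocations) / (1024 * 1000)
}

// Assumed numbers: doubling memory halves the duration of a CPU-bound handler.
const atHalfGb = gbSeconds(512, 400, 1_000_000)
const atOneGb = gbSeconds(1024, 200, 1_000_000)
console.log(atHalfGb, atOneGb)
```

Both configurations bill the same compute in this scenario, but the larger one answers in half the time. Measure real durations before committing, because I/O-bound handlers do not speed up this way, and provisioned concurrency is billed separately.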

Conclusion

Cloud architecture with AWS offers incredible power and flexibility, but success depends on proper design and implementation. The patterns and practices I’ve shared here provide a solid foundation for building scalable, secure, and cost-effective cloud applications.

Key takeaways:

Infrastructure as Code for consistency and automation
Serverless architecture for cost efficiency
Multi-region design for high availability
Security-first approach with defense in depth
Comprehensive monitoring for observability
Cost optimization through right-sizing and automation

Remember, cloud architecture is an iterative process. Start simple, measure everything, and continuously optimize based on real-world usage patterns.

What cloud architecture challenges are you facing? Which AWS services have worked best for your use cases? Share your experiences!