Comprehensive guide to designing cloud-native applications and AWS infrastructure with real-world patterns and cost optimization

Cloud Architecture and AWS Best Practices: Building Scalable Infrastructure
Hello! I’m Ahmet Zeybek, a full stack developer with extensive experience in cloud architecture and AWS infrastructure. Moving to the cloud has transformed how we build and scale applications, offering unprecedented flexibility and power. In this comprehensive guide, I’ll share the patterns and practices that have helped me design cost-effective, scalable, and reliable cloud architectures.
Cloud Architecture Fundamentals
1. Well-Architected Framework
The AWS Well-Architected Framework defines six pillars. This guide walks through five of them (the sixth, Sustainability, was added in late 2021):
Operational Excellence
- Automate everything: Infrastructure as Code (IaC)
- Monitor and log: Comprehensive observability
- Incident response: Runbooks and automation
Security
- Defense in depth: Multiple security layers
- Least privilege: Minimal required permissions
- Encryption everywhere: Data at rest and in transit
Reliability
- Fault tolerance: Design for failure
- Auto scaling: Handle traffic spikes
- Disaster recovery: Multi-region backup
Performance Efficiency
- Right-sized resources: Don’t over-provision
- Caching strategies: Reduce database load
- CDN usage: Global content delivery
Cost Optimization
- Demand-based scaling: Pay only for what you use
- Resource optimization: Right-size instances
- Storage tiering: Use appropriate storage classes
Infrastructure as Code
2. AWS CDK for Infrastructure
Modern infrastructure provisioning:
import * as cdk from 'aws-cdk-lib'
import { Construct } from 'constructs'
import * as ec2 from 'aws-cdk-lib/aws-ec2'
import * as rds from 'aws-cdk-lib/aws-rds'
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as apigateway from 'aws-cdk-lib/aws-apigateway'
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront'
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins'
import * as s3 from 'aws-cdk-lib/aws-s3'

export class MyAppStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props)

    // VPC with public, private (NAT egress), and isolated database subnets
    const vpc = new ec2.Vpc(this, 'MyAppVPC', {
      maxAzs: 3,
      natGateways: 1,
      subnetConfiguration: [
        { cidrMask: 24, name: 'Public', subnetType: ec2.SubnetType.PUBLIC },
        // PRIVATE_WITH_NAT is deprecated; PRIVATE_WITH_EGRESS replaces it
        { cidrMask: 24, name: 'Private', subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
        { cidrMask: 24, name: 'Database', subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
      ],
    })

    // RDS PostgreSQL database; credentials are generated into Secrets Manager
    const database = new rds.DatabaseInstance(this, 'MyAppDB', {
      engine: rds.DatabaseInstanceEngine.postgres({ version: rds.PostgresEngineVersion.VER_15 }),
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MICRO),
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
      databaseName: 'myapp',
      allocatedStorage: 20,
      maxAllocatedStorage: 100,
      storageEncrypted: true,
      backupRetention: cdk.Duration.days(7),
      deletionProtection: true,
      monitoringInterval: cdk.Duration.seconds(60),
      enablePerformanceInsights: true,
    })

    // Lambda function
    const lambdaSecurityGroup = createLambdaSecurityGroup(this, vpc)
    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      code: lambda.Code.fromAsset('../lambda/dist'),
      handler: 'index.handler',
      timeout: cdk.Duration.seconds(30),
      memorySize: 512,
      environment: {
        // Pass the secret's ARN rather than its value: the generated secret has
        // no 'connectionString' key, and unwrapping it would write the password
        // into the template. The handler reads the secret at runtime.
        DATABASE_SECRET_ARN: database.secret!.secretArn,
        NODE_ENV: 'production',
      },
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      securityGroups: [lambdaSecurityGroup],
    })
    database.secret!.grantRead(apiHandler)
    database.connections.allowDefaultPortFrom(lambdaSecurityGroup)

    // API Gateway
    const api = new apigateway.RestApi(this, 'MyAppAPI', {
      restApiName: 'MyApp API',
      description: 'API for MyApp',
      deployOptions: {
        stageName: 'prod',
        dataTraceEnabled: true,
        loggingLevel: apigateway.MethodLoggingLevel.INFO,
        metricsEnabled: true,
      },
    })

    // A RestApi must expose at least one method; wire the user routes to the handler
    const integration = new apigateway.LambdaIntegration(apiHandler)
    const users = api.root.addResource('users')
    users.addMethod('GET', integration)
    users.addMethod('POST', integration)
    const user = users.addResource('{id}')
    user.addMethod('GET', integration)
    user.addMethod('PUT', integration)
    user.addMethod('DELETE', integration)

    // CloudFront distribution in front of the API
    const distribution = new cloudfront.Distribution(this, 'MyAppCDN', {
      defaultBehavior: {
        origin: new origins.RestApiOrigin(api),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        compress: true,
        // API responses are dynamic, so caching is disabled here;
        // CACHING_OPTIMIZED is better suited to static assets
        cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
      },
      comment: 'CDN for MyApp',
      enabled: true,
      httpVersion: cloudfront.HttpVersion.HTTP2_AND_3,
      priceClass: cloudfront.PriceClass.PRICE_CLASS_ALL,
    })

    // S3 bucket for static assets
    const assetsBucket = new s3.Bucket(this, 'AssetsBucket', {
      bucketName: `myapp-assets-${this.account}-${this.region}`,
      publicReadAccess: false,
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      encryption: s3.BucketEncryption.S3_MANAGED,
      versioned: true,
      lifecycleRules: [
        {
          id: 'Transition to IA',
          enabled: true,
          transitions: [
            {
              storageClass: s3.StorageClass.INFREQUENT_ACCESS,
              transitionAfter: cdk.Duration.days(30),
            },
            {
              storageClass: s3.StorageClass.GLACIER,
              transitionAfter: cdk.Duration.days(90),
            },
          ],
        },
      ],
    })

    // Outputs
    new cdk.CfnOutput(this, 'CDNURL', {
      value: `https://${distribution.distributionDomainName}`,
      description: 'CloudFront distribution URL',
    })

    new cdk.CfnOutput(this, 'DatabaseEndpoint', {
      value: database.instanceEndpoint.hostname,
      description: 'Database endpoint',
    })
  }
}

function createLambdaSecurityGroup(scope: Construct, vpc: ec2.IVpc): ec2.SecurityGroup {
  // allowAllOutbound must be false, otherwise the explicit egress rules below are ignored
  const sg = new ec2.SecurityGroup(scope, 'LambdaSG', { vpc, allowAllOutbound: false })

  // Allow outbound to database
  sg.addEgressRule(ec2.Peer.ipv4(vpc.vpcCidrBlock), ec2.Port.tcp(5432), 'Allow database connections')

  // Allow outbound to internet (for external APIs)
  sg.addEgressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(443), 'Allow HTTPS outbound')

  return sg
}
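The lifecycle rule on the assets bucket moves objects to Infrequent Access after 30 days and to Glacier after 90. As a rough sketch, that schedule can be written as a pure function, which is handy for unit-testing what you expect a rule to do. The names `StorageTier`, `Transition`, and `storageClassForAge` are hypothetical and not part of the CDK or the AWS SDK, and S3 applies lifecycle rules asynchronously, so real transitions can lag behind this idealized model.

```typescript
type StorageTier = 'STANDARD' | 'STANDARD_IA' | 'GLACIER'

interface Transition {
  afterDays: number
  tier: StorageTier
}

// Mirrors the two transitions in the bucket's lifecycle rule (assumed values: 30 and 90 days).
const defaultTransitions: Transition[] = [
  { afterDays: 30, tier: 'STANDARD_IA' },
  { afterDays: 90, tier: 'GLACIER' },
]

// Returns the tier an object of the given age would be in once all
// applicable transitions have been applied.
function storageClassForAge(
  ageDays: number,
  rules: Transition[] = defaultTransitions,
): StorageTier {
  let tier: StorageTier = 'STANDARD'
  const ordered = rules.slice().sort((a, b) => a.afterDays - b.afterDays)
  for (const rule of ordered) {
    if (ageDays >= rule.afterDays) tier = rule.tier
  }
  return tier
}
```

Calling `storageClassForAge(45)` under these assumptions yields the Infrequent Access tier, since only the 30-day transition has been reached.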
Serverless Architecture
3. Event-Driven Serverless Design
Build applications that respond to events:
// AWS Lambda handlers
import type { APIGatewayEvent, S3Event } from 'aws-lambda'
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
import { SNSClient, PublishCommand } from '@aws-sdk/client-sns'

// Helpers such as processCSVFile, updateProcessingStatus, getUsers, and
// validateOrder are application code that is not shown in this post.

const dynamoClient = new DynamoDBClient({})
const s3Client = new S3Client({})
const snsClient = new SNSClient({})

// Caught values are typed `unknown`, so narrow before reading `.message`
const errorMessage = (error: unknown): string =>
  error instanceof Error ? error.message : String(error)

// Process file upload event
export const processFileUpload = async (event: S3Event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name
    // Object keys arrive URL-encoded, with spaces encoded as '+'
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))

    try {
      // Get file metadata
      const fileData = await s3Client.send(new GetObjectCommand({ Bucket: bucket, Key: key }))

      // Process file based on type
      if (key.endsWith('.csv')) {
        await processCSVFile(bucket, key)
      } else if (key.endsWith('.json')) {
        await processJSONFile(bucket, key)
      }

      // Update processing status
      await updateProcessingStatus(key, 'completed')

      // Send notification
      await snsClient.send(
        new PublishCommand({
          TopicArn: process.env.NOTIFICATION_TOPIC_ARN,
          Message: JSON.stringify({
            type: 'FILE_PROCESSED',
            fileKey: key,
            status: 'success',
          }),
        })
      )
    } catch (error) {
      console.error('File processing error:', error)

      // Update status to failed
      await updateProcessingStatus(key, 'failed')

      // Send error notification
      await snsClient.send(
        new PublishCommand({
          TopicArn: process.env.NOTIFICATION_TOPIC_ARN,
          Message: JSON.stringify({
            type: 'FILE_PROCESSING_ERROR',
            fileKey: key,
            error: errorMessage(error),
          }),
        })
      )
    }
  }
}

// API Gateway handler
export const apiHandler = async (event: APIGatewayEvent) => {
  // `resource` holds the route template (e.g. /users/{id});
  // `path` holds the concrete URL (e.g. /users/42) and would never match the cases below
  const { httpMethod, resource, body } = event

  try {
    switch (`${httpMethod} ${resource}`) {
      case 'GET /users':
        return await getUsers()

      case 'POST /users':
        return await createUser(JSON.parse(body ?? '{}'))

      case 'GET /users/{id}':
        return await getUser(event.pathParameters?.id)

      case 'PUT /users/{id}':
        return await updateUser(event.pathParameters?.id, JSON.parse(body ?? '{}'))

      case 'DELETE /users/{id}':
        return await deleteUser(event.pathParameters?.id)

      default:
        return {
          statusCode: 404,
          body: JSON.stringify({ error: 'Not found' }),
        }
    }
  } catch (error) {
    console.error('API error:', error)

    return {
      statusCode: 500,
      body: JSON.stringify({
        error: 'Internal server error',
        message: process.env.NODE_ENV === 'development' ? errorMessage(error) : 'Something went wrong',
      }),
    }
  }
}

// Step Functions for complex workflows
interface StepFunctionEvent {
  orderId: string
}

export const orderProcessingWorkflow = async (event: StepFunctionEvent) => {
  const { orderId } = event

  try {
    // 1. Validate order
    await validateOrder(orderId)

    // 2. Check inventory
    const inventoryAvailable = await checkInventory(orderId)

    if (!inventoryAvailable) {
      await updateOrderStatus(orderId, 'CANCELLED')
      return { status: 'cancelled', reason: 'insufficient_inventory' }
    }

    // 3. Process payment
    const paymentResult = await processPayment(orderId)

    if (!paymentResult.success) {
      await updateOrderStatus(orderId, 'PAYMENT_FAILED')
      return { status: 'failed', reason: 'payment_failed' }
    }

    // 4. Reserve inventory
    await reserveInventory(orderId)

    // 5. Update order status
    await updateOrderStatus(orderId, 'CONFIRMED')

    // 6. Send confirmation email
    await sendOrderConfirmation(orderId)

    return { status: 'completed', orderId }
  } catch (error) {
    console.error('Order processing error:', error)

    // Compensating actions
    await updateOrderStatus(orderId, 'FAILED')
    await releaseInventory(orderId)

    throw error
  }
}
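The `catch` block in the order workflow applies its compensating actions by hand. One way to generalize that idea, sketched here under the assumption that each step can describe its own undo, is a small saga runner that compensates completed steps in reverse order when a later step fails. `SagaStep` and `runSaga` are hypothetical names for illustration; in a real deployment a Step Functions state machine with `Catch` states would typically own this logic.

```typescript
interface SagaStep {
  name: string
  run: () => Promise<void>
  // Optional undo for this step; only called if a later step fails.
  compensate?: () => Promise<void>
}

interface SagaResult {
  ok: boolean
  failedStep?: string
}

async function runSaga(steps: SagaStep[]): Promise<SagaResult> {
  const completed: SagaStep[] = []
  for (const step of steps) {
    try {
      await step.run()
      completed.push(step)
    } catch {
      // Undo the steps that already succeeded, most recent first.
      for (const done of completed.reverse()) {
        if (done.compensate) await done.compensate()
      }
      return { ok: false, failedStep: step.name }
    }
  }
  return { ok: true }
}
```

With steps for reserving inventory, charging payment, and sending the confirmation, a failed confirmation would trigger the payment refund first and the inventory release second.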
Database Architecture
4. Multi-Region Database Design
Ensure high availability and disaster recovery:
// Aurora PostgreSQL cluster with one writer and one reader in the same region.
// Note: a true Aurora Global Database also needs an rds.CfnGlobalCluster plus a
// secondary cluster deployed from a stack in another region.
const globalDatabase = new rds.DatabaseCluster(this, 'GlobalDB', {
  engine: rds.DatabaseClusterEngine.auroraPostgres({
    version: rds.AuroraPostgresEngineVersion.VER_15_4,
  }),
  writer: rds.ClusterInstance.provisioned('Writer', {
    instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.LARGE),
  }),
  readers: [
    // Reader for read scaling; it is also a failover target.
    // ClusterInstance is created through a factory method, not with `new`.
    rds.ClusterInstance.provisioned('ReadReplica', {
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.LARGE),
      promotionTier: 2,
    }),
  ],
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
  storageEncrypted: true,
  backup: {
    retention: cdk.Duration.days(7),
    preferredWindow: '03:00-04:00',
  },
  monitoringInterval: cdk.Duration.seconds(60),
  enablePerformanceInsights: true,
})

// ElastiCache for Redis (serverless)
const redisCluster = new elasticache.CfnServerlessCache(this, 'RedisCluster', {
  engine: 'redis',
  serverlessCacheName: 'myapp-redis',
  description: 'Redis cluster for session storage and caching',
  securityGroupIds: [redisSecurityGroup.securityGroupId],
  subnetIds: vpc.privateSubnets.map((subnet) => subnet.subnetId),
  cacheUsageLimits: {
    dataStorage: {
      maximum: 10,
      unit: 'GB',
    },
    ecpuPerSecond: {
      maximum: 10000,
    },
  },
  dailySnapshotTime: '05:00',
  majorEngineVersion: '7',
})
Security Architecture
5. Zero-Trust Security Model
Implement comprehensive security:
// IAM policies with least privilege
const lambdaExecutionRole = new iam.Role(this, 'LambdaExecutionRole', {
  assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com'),
  managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole')],
  inlinePolicies: {
    DatabaseAccess: new iam.PolicyDocument({
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          // The resource is the Secrets Manager secret, so the matching action is
          // GetSecretValue; rds-db:connect applies to IAM database auth ARNs instead
          actions: ['secretsmanager:GetSecretValue'],
          resources: [database.secret!.secretArn],
        }),
      ],
    }),
    // Each inline policy must be a PolicyDocument, not a bare PolicyStatement
    S3Access: new iam.PolicyDocument({
      statements: [
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          actions: ['s3:GetObject', 's3:PutObject'],
          resources: [`${assetsBucket.bucketArn}/*`],
        }),
      ],
    }),
  },
})

// VPC endpoints for secure access
const dynamodbEndpoint = new ec2.GatewayVpcEndpoint(this, 'DynamoDBEndpoint', {
  service: ec2.GatewayVpcEndpointAwsService.DYNAMODB,
  vpc,
  subnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
})

const s3Endpoint = new ec2.GatewayVpcEndpoint(this, 'S3Endpoint', {
  service: ec2.GatewayVpcEndpointAwsService.S3,
  vpc,
  subnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
})

// Security groups with specific rules
const databaseSecurityGroup = new ec2.SecurityGroup(this, 'DatabaseSG', {
  vpc,
  description: 'Security group for database',
  allowAllOutbound: false,
})

// Only allow connections from application security group
databaseSecurityGroup.addIngressRule(
  applicationSecurityGroup,
  ec2.Port.tcp(5432),
  'Allow PostgreSQL connections from application'
)

// WAF for API Gateway
const webACL = new wafv2.CfnWebACL(this, 'MyAppWebACL', {
  name: 'MyAppWebACL',
  scope: 'REGIONAL',
  // Allow by default; with `block` here every request would be rejected,
  // because both rules below only ever block
  defaultAction: { allow: {} },
  rules: [
    {
      name: 'RateLimit',
      priority: 1,
      action: { block: {} },
      statement: {
        rateBasedStatement: {
          limit: 1000,
          aggregateKeyType: 'IP',
        },
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'RateLimitRule',
      },
    },
    {
      name: 'SQLInjection',
      priority: 2,
      action: { block: {} },
      statement: {
        sqliMatchStatement: {
          fieldToMatch: { body: {} },
          textTransformations: [
            { priority: 0, type: 'LOWERCASE' },
            { priority: 1, type: 'URL_DECODE' },
          ],
        },
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: 'SQLInjectionRule',
      },
    },
  ],
  visibilityConfig: {
    sampledRequestsEnabled: true,
    cloudWatchMetricsEnabled: true,
    metricName: 'MyAppWebACL',
  },
})
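The `RateLimit` rule blocks an IP address once it exceeds 1000 requests within WAF's evaluation window, which is five minutes unless configured otherwise. To make the counting concrete, here is a simplified fixed-window counter. It is an illustration of the idea only and not how AWS WAF is implemented internally; `FixedWindowLimiter` and its method names are hypothetical.

```typescript
interface WindowEntry {
  windowStart: number
  count: number
}

class FixedWindowLimiter {
  private counts: Record<string, WindowEntry> = {}
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true if the request is allowed, false if it should be blocked.
  allow(ip: string, nowMs: number): boolean {
    const entry = this.counts[ip]
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.counts[ip] = { windowStart: nowMs, count: 1 }
      return true
    }
    entry.count += 1
    return entry.count <= this.limit
  }
}

// The values from the rule above: 1000 requests per five-minute window.
const wafLikeLimiter = new FixedWindowLimiter(1000, 5 * 60 * 1000)
```

A fixed window is the simplest variant to reason about; it allows short bursts at window boundaries, which is one reason rate limits like this are best treated as coarse protection rather than precise quotas.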
Monitoring and Observability
6. Comprehensive Monitoring Setup
Monitor your entire infrastructure:
// Assumes imports for aws-cloudwatch (cloudwatch), aws-cloudwatch-actions (actions),
// aws-sns (sns), and aws-sns-subscriptions (subscriptions) from aws-cdk-lib

// CloudWatch dashboards
const dashboard = new cloudwatch.Dashboard(this, 'MyAppDashboard', {
  dashboardName: 'MyApp-Monitoring-Dashboard',
  defaultInterval: cdk.Duration.hours(24),
})

// Add widgets to dashboard
dashboard.addWidgets(
  new cloudwatch.GraphWidget({
    title: 'API Gateway Latency',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/ApiGateway',
        metricName: 'Latency',
        dimensionsMap: { ApiName: api.restApiName },
      }),
    ],
  }),

  new cloudwatch.GraphWidget({
    title: 'Lambda Errors',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/Lambda',
        metricName: 'Errors',
        dimensionsMap: { FunctionName: apiHandler.functionName },
      }),
    ],
  }),

  new cloudwatch.GraphWidget({
    title: 'Database Connections',
    left: [
      new cloudwatch.Metric({
        namespace: 'AWS/RDS',
        metricName: 'DatabaseConnections',
        dimensionsMap: { DBInstanceIdentifier: database.instanceIdentifier },
      }),
    ],
  })
)

// CloudWatch alarms
const highLatencyAlarm = new cloudwatch.Alarm(this, 'HighLatencyAlarm', {
  alarmName: 'MyApp-HighLatency',
  alarmDescription: 'API Gateway latency is too high',
  metric: new cloudwatch.Metric({
    namespace: 'AWS/ApiGateway',
    metricName: 'Latency',
    dimensionsMap: { ApiName: api.restApiName },
  }),
  threshold: 1000, // 1 second (the Latency metric is in milliseconds)
  evaluationPeriods: 2,
  comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
  treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
})

// SNS topic for notifications
const alarmTopic = new sns.Topic(this, 'AlarmTopic', {
  topicName: 'MyApp-Alarms',
  displayName: 'MyApp Alarm Notifications',
})

// Subscribe email to alarms
alarmTopic.addSubscription(new subscriptions.EmailSubscription('alerts@myapp.com'))

// Connect alarm to topic
highLatencyAlarm.addAlarmAction(new actions.SnsAction(alarmTopic))
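`HighLatencyAlarm` combines four settings: a 1000 ms threshold, two evaluation periods, a greater-than comparison, and missing data treated as not breaching. The sketch below shows how those settings interact in a deliberately simplified model. It ignores CloudWatch features such as M-out-of-N `datapointsToAlarm` and the `INSUFFICIENT_DATA` state, and `evaluateAlarm` is a hypothetical helper, not a CloudWatch API.

```typescript
type AlarmState = 'OK' | 'ALARM'

// `datapoints` holds one value per period, oldest first; null marks a period with no data.
function evaluateAlarm(
  datapoints: Array<number | null>,
  threshold: number,
  evaluationPeriods: number,
): AlarmState {
  if (datapoints.length < evaluationPeriods) return 'OK'
  const recent = datapoints.slice(-evaluationPeriods)
  // NOT_BREACHING: a missing datapoint counts as being within the threshold.
  // GREATER_THAN_THRESHOLD: a value equal to the threshold does not breach.
  const allBreaching = recent.every((value) => value !== null && value > threshold)
  return allBreaching ? 'ALARM' : 'OK'
}
```

Under this model a single slow period never pages anyone; the alarm fires only when two consecutive periods both exceed one second.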
Cost Optimization
7. Cost Optimization Strategies
Reduce cloud costs while maintaining performance:
// Auto scaling configuration
const autoScalingGroup = new autoscaling.AutoScalingGroup(this, 'WebServerASG', {
  vpc,
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MICRO),
  machineImage: ec2.MachineImage.latestAmazonLinux2(),
  minCapacity: 1,
  maxCapacity: 10,
  desiredCapacity: 2,
  cooldown: cdk.Duration.minutes(5),
})

// Scale based on CPU utilization (target tracking).
// Scaling policies are attached with methods, not constructor props.
autoScalingGroup.scaleOnCpuUtilization('ScaleOut', {
  targetUtilizationPercent: 70,
})

// Scheduled scaling for predictable traffic
autoScalingGroup.scaleOnSchedule('ScaleUpForBusinessHours', {
  schedule: autoscaling.Schedule.cron({ hour: '9', minute: '0' }),
  minCapacity: 3,
  maxCapacity: 8,
  desiredCapacity: 5,
  timeZone: 'America/New_York',
})

autoScalingGroup.scaleOnSchedule('ScaleDownAfterBusinessHours', {
  schedule: autoscaling.Schedule.cron({ hour: '18', minute: '0' }),
  minCapacity: 1,
  maxCapacity: 3,
  desiredCapacity: 2,
  timeZone: 'America/New_York',
})

// Spot instances for cost savings
const spotFleet = new ec2.CfnSpotFleet(this, 'SpotFleet', {
  spotFleetRequestConfigData: {
    iamFleetRole: fleetRole.roleArn,
    allocationStrategy: 'diversified',
    targetCapacity: 10,
    spotPrice: '0.10', // Maximum spot price
    launchSpecifications: [
      {
        instanceType: 'm5.large',
        imageId: 'ami-12345678', // placeholder AMI ID
        keyName: 'my-key-pair',
        securityGroups: [{ groupId: webSecurityGroup.securityGroupId }],
        subnetId: vpc.publicSubnets[0].subnetId,
        weightedCapacity: 1,
      },
    ],
  },
})

// S3 Intelligent-Tiering
const intelligentTieringBucket = new s3.Bucket(this, 'IntelligentTieringBucket', {
  bucketName: `myapp-intelligent-${this.account}`,
  // Objects must be stored in the INTELLIGENT_TIERING class for tiering to apply
  lifecycleRules: [
    {
      transitions: [
        {
          storageClass: s3.StorageClass.INTELLIGENT_TIERING,
          transitionAfter: cdk.Duration.days(0),
        },
      ],
    },
  ],
  // The frequent and infrequent access tiers are managed automatically;
  // only the optional archive tiers are configurable
  intelligentTieringConfigurations: [
    {
      name: 'EntireBucket',
      archiveAccessTierTime: cdk.Duration.days(365),
    },
  ],
})

// Bucket for cost and usage reports
const costReport = new s3.Bucket(this, 'CostReportBucket', {
  bucketName: `myapp-cost-reports-${this.account}`,
  lifecycleRules: [
    {
      id: 'DeleteOldReports',
      enabled: true,
      expiration: cdk.Duration.days(2555), // 7 years
    },
  ],
})

// Enable cost and usage report (report definitions live in us-east-1)
new cur.CfnReportDefinition(this, 'CostAndUsageReport', {
  reportName: 'MyAppCostAndUsageReport',
  timeUnit: 'DAILY',
  format: 'Parquet',
  compression: 'Parquet',
  additionalSchemaElements: ['RESOURCES'],
  s3Bucket: costReport.bucketName,
  s3Prefix: 'cost-reports',
  s3Region: this.region,
  refreshClosedReports: true,
  reportVersioning: 'OVERWRITE_REPORT', // required property
})
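The auto scaling group targets 70% average CPU between a minimum of 1 and a maximum of 10 instances. AWS does not publish the exact target tracking algorithm, so the following is only a mental model: a proportional estimate of how many instances would bring the metric back to the target, clamped to the group's limits. `estimateDesiredCapacity` is a hypothetical helper, and the real service also applies cooldowns and scales in more conservatively than it scales out.

```typescript
// Proportional estimate: if 2 instances run at 90% CPU against a 70% target,
// roughly 2 * 90 / 70 = 2.57 instances are needed, rounded up to 3.
function estimateDesiredCapacity(
  currentCapacity: number,
  metricValue: number,
  targetValue: number,
  minCapacity: number,
  maxCapacity: number,
): number {
  const proportional = Math.ceil((currentCapacity * metricValue) / targetValue)
  return Math.min(maxCapacity, Math.max(minCapacity, proportional))
}
```

The clamp matters for cost: however high the load climbs, this group never grows past 10 instances, which caps the worst-case bill.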
Multi-Region Architecture
8. Global Infrastructure Design
Build for global scale:
// Multi-region setup
// assetsBucket and edgeFunction are assumed to be defined elsewhere in the app
export class GlobalStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: cdk.StackProps) {
    super(scope, id, props)

    // Primary region (us-east-1)
    const primaryRegion = new cdk.Stack(scope, 'PrimaryRegion', {
      env: { region: 'us-east-1' },
    })

    // Secondary region (eu-west-1)
    const secondaryRegion = new cdk.Stack(scope, 'SecondaryRegion', {
      env: { region: 'eu-west-1' },
    })

    // Global resources
    const globalTable = new dynamodb.Table(this, 'GlobalTable', {
      tableName: 'MyApp-GlobalTable',
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      sortKey: { name: 'sk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      // List replica regions only; the stack's own region (assumed to be
      // us-east-1 here) must not appear in this list
      replicationRegions: ['eu-west-1', 'ap-southeast-1'],
      pointInTimeRecovery: true,
    })

    // Route 53 for global routing
    const hostedZone = new route53.HostedZone(this, 'MyAppZone', {
      zoneName: 'myapp.com',
    })

    // CloudFront with Lambda@Edge
    const distribution = new cloudfront.Distribution(this, 'GlobalCDN', {
      defaultBehavior: {
        // S3Origin is deprecated in recent CDK releases in favor of S3BucketOrigin
        origin: origins.S3BucketOrigin.withOriginAccessControl(assetsBucket),
        edgeLambdas: [
          {
            functionVersion: edgeFunction.currentVersion,
            eventType: cloudfront.LambdaEdgeEventType.ORIGIN_REQUEST,
          },
        ],
      },
    })

    // Global health check
    const healthCheck = new route53.HealthCheck(this, 'GlobalHealthCheck', {
      fqdn: 'api.myapp.com',
      port: 443,
      type: route53.HealthCheckType.HTTPS,
      resourcePath: '/health',
      failureThreshold: 3,
      requestInterval: cdk.Duration.seconds(30),
    })
  }
}
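The health check uses `failureThreshold: 3`, meaning the endpoint's status only flips after three consecutive checks disagree with its current status. A small sketch of that hysteresis follows. It models a single checker, whereas Route 53 aggregates results from checkers in several locations, and `HealthTracker` is a hypothetical name.

```typescript
class HealthTracker {
  private healthy = true
  // Number of consecutive results that disagree with the current status.
  private streak = 0
  private readonly failureThreshold: number

  constructor(failureThreshold: number) {
    this.failureThreshold = failureThreshold
  }

  // Records one check result and returns the resulting status (true = healthy).
  record(checkPassed: boolean): boolean {
    if (checkPassed === this.healthy) {
      // Result agrees with the current status, so any streak is broken.
      this.streak = 0
    } else {
      this.streak += 1
      if (this.streak >= this.failureThreshold) {
        this.healthy = checkPassed
        this.streak = 0
      }
    }
    return this.healthy
  }
}
```

With a 30-second request interval, three consecutive failures mean roughly a minute and a half passes before traffic is routed away, which is the trade-off between fast failover and flapping on transient errors.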
DevOps and Automation
9. CI/CD Pipeline with AWS
Automated deployment pipeline:
name: Deploy to AWS

on:
  push:
    branches: [main]
  workflow_dispatch:

# Required for assuming the AWS role through GitHub OIDC
permissions:
  id-token: write
  contents: read

env:
  AWS_REGION: us-east-1

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      # NODE_ENV is not set globally: with NODE_ENV=production, npm ci
      # would skip the devDependencies that the tests and build need
      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm run test

      - name: Build application
        run: npm run build
        env:
          NODE_ENV: production

      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: build-artifacts
          path: dist/

  deploy-infrastructure:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Install CDK dependencies
        run: npm ci

      # Runs the CDK CLI through npx; assumes aws-cdk is listed in devDependencies
      - name: Deploy to AWS
        run: |
          npx cdk bootstrap
          npx cdk deploy --require-approval never

  deploy-application:
    needs: deploy-infrastructure
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: build-artifacts
          path: dist/

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy Lambda functions
        run: |
          # Package the build and upload it to the deployment bucket
          (cd dist && zip -r ../api-handler.zip .)
          aws s3 cp api-handler.zip s3://myapp-deployment-bucket/lambda-functions/api-handler.zip

          # Update Lambda function code
          aws lambda update-function-code \
            --function-name MyAppApiHandler \
            --s3-bucket myapp-deployment-bucket \
            --s3-key lambda-functions/api-handler.zip

          # Update API Gateway
          aws apigateway update-stage \
            --rest-api-id ${{ secrets.API_GATEWAY_ID }} \
            --stage-name prod \
            --patch-operations op=replace,path=/deploymentId,value=${{ secrets.DEPLOYMENT_ID }}

  smoke-tests:
    needs: deploy-application
    runs-on: ubuntu-latest

    steps:
      - name: Run smoke tests
        run: |
          # Health check
          curl -f https://api.myapp.com/health

          # Basic API tests
          curl -f -X POST https://api.myapp.com/test \
            -H "Content-Type: application/json" \
            -d '{"test": "data"}'
Disaster Recovery
10. Backup and Recovery Strategy
Ensure business continuity:
// kmsKey and secondaryRegion (a region name string) are assumed to be defined elsewhere

// Backup vault with encryption (declared first so the plan can reference it)
const backupVault = new backup.BackupVault(this, 'MyAppBackupVault', {
  backupVaultName: 'MyApp-BackupVault',
  encryptionKey: kmsKey,
  accessPolicy: new iam.PolicyDocument({
    statements: [
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        principals: [new iam.AccountPrincipal(this.account)],
        actions: ['backup:*'],
        resources: ['*'],
      }),
    ],
  }),
})

// Automated backup strategy
// Resources still have to be attached with backupPlan.addSelection(...)
const backupPlan = new backup.BackupPlan(this, 'MyAppBackupPlan', {
  backupPlanName: 'MyApp-BackupPlan',
  backupVault,
  backupPlanRules: [
    new backup.BackupPlanRule({
      ruleName: 'DailyBackups',
      scheduleExpression: events.Schedule.cron({ hour: '2', minute: '0' }),
      deleteAfter: cdk.Duration.days(30),
    }),
    new backup.BackupPlanRule({
      ruleName: 'WeeklyBackups',
      scheduleExpression: events.Schedule.cron({ weekDay: 'SUN', hour: '3', minute: '0' }),
      deleteAfter: cdk.Duration.days(90),
    }),
  ],
})

// Cross-region replication for S3
// Uses the L2 replicationRules prop, which needs a recent aws-cdk-lib release
const replicatedBucket = new s3.Bucket(this, 'ReplicatedBucket', {
  bucketName: `myapp-replicated-${this.account}`,
  versioned: true, // replication requires versioning on the source bucket
  replicationRules: [
    {
      id: 'CrossRegionReplication',
      destination: s3.Bucket.fromBucketArn(
        this,
        'BackupBucket',
        `arn:aws:s3:::myapp-backup-${secondaryRegion}`
      ),
      storageClass: s3.StorageClass.INFREQUENT_ACCESS,
      priority: 1,
    },
  ],
})

// Disaster recovery Lambda function
const disasterRecoveryFunction = new lambda.Function(this, 'DisasterRecovery', {
  runtime: lambda.Runtime.NODEJS_20_X,
  code: lambda.Code.fromAsset('../lambda/disaster-recovery'),
  handler: 'index.handler',
  timeout: cdk.Duration.minutes(15),
  environment: {
    PRIMARY_REGION: this.region,
    SECONDARY_REGION: secondaryRegion,
    BACKUP_BUCKET: replicatedBucket.bucketName,
  },
})
Performance Optimization
11. Performance Monitoring and Optimization
Monitor and optimize performance:
// Lambda performance optimization
const optimizedFunction = new lambda.Function(this, 'OptimizedFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  code: lambda.Code.fromAsset('../lambda/optimized'),
  handler: 'index.handler',
  memorySize: 1024, // Increased memory for better CPU allocation
  timeout: cdk.Duration.seconds(30),
  reservedConcurrentExecutions: 50,
  environment: {
    NODE_OPTIONS: '--enable-source-maps --stack-trace-limit=1000',
  },

  // Dead letter queue for failed invocations
  deadLetterQueue: new sqs.Queue(this, 'FailedInvocationsDLQ', {
    queueName: 'MyApp-FailedInvocations',
    retentionPeriod: cdk.Duration.days(14),
  }),
})

// Provisioned concurrency for predictable performance.
// It is configured on a version or alias, not on the function itself.
const liveAlias = new lambda.Alias(this, 'OptimizedFunctionLive', {
  aliasName: 'live',
  version: optimizedFunction.currentVersion,
  provisionedConcurrentExecutions: 10,
})

// ElastiCache for Redis with read replicas
const redisCluster = new elasticache.CfnReplicationGroup(this, 'RedisCluster', {
  replicationGroupId: 'myapp-redis',
  replicationGroupDescription: 'Redis cluster for caching',
  engine: 'redis',
  engineVersion: '7.0',
  cacheNodeType: 'cache.t3.micro',
  automaticFailoverEnabled: true,
  multiAzEnabled: true,
  cacheSubnetGroupName: cacheSubnetGroup.ref,
  securityGroupIds: [redisSecurityGroup.securityGroupId],

  // One shard with two read replicas for read-heavy workloads.
  // numCacheClusters is omitted because it cannot be combined with
  // numNodeGroups / replicasPerNodeGroup.
  numNodeGroups: 1,
  replicasPerNodeGroup: 2,
})
Conclusion
Cloud architecture with AWS offers incredible power and flexibility, but success depends on proper design and implementation. The patterns and practices I’ve shared here provide a solid foundation for building scalable, secure, and cost-effective cloud applications.
Key takeaways:
- Infrastructure as Code for consistency and automation
- Serverless architecture for cost efficiency
- Multi-region design for high availability
- Security-first approach with defense in depth
- Comprehensive monitoring for observability
- Cost optimization through right-sizing and automation
Remember, cloud architecture is an iterative process. Start simple, measure everything, and continuously optimize based on real-world usage patterns.
What cloud architecture challenges are you facing? Which AWS services have worked best for your use cases? Share your experiences!
Further Reading
- AWS Well-Architected Framework
- AWS CDK Documentation
- Serverless Framework
- AWS Cost Optimization
- Cloud Native Computing Foundation
This post reflects my experience as of October 2025. AWS services and best practices evolve rapidly, so always verify the latest documentation and regional availability.