ruk·si

☁️ Cloud Infrastructure
Basics

Updated at 2020-10-08 13:57

Some basic tips and tricks related to cloud providers like AWS, Azure, GCP, etc.

Have your infrastructure as code (IaC). This lets you define and test your setup in a single place. It helps keep your infrastructure maintainable: you can deploy to alternative locations, version control your setup, and roll back changes. Most cloud providers have their own template languages (e.g. AWS CloudFormation templates) for defining infrastructure, but I'd recommend using more open solutions.

Terraform = cloud-agnostic infrastructure provisioning tool

# configuration management
Ansible = you define how systems relate to each other
Chef = you define "steps" how to get to desired state
Puppet = you define "state" and Puppet generates steps how to get there
SaltStack = like Puppet

Chef, Puppet = pull and execute
SaltStack, Ansible = push to execute

# provider-specific IaC tools
AWS CloudFormation
Azure Resource Manager
Google Cloud Deployment Manager

Test your IaC definitions. All tools should have a dry-run mode, or some mock/test library; even a simple lint or smoke test is fine. Also set this up in your project's CI (e.g. CircleCI).
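A simple lint could look something like the sketch below. It assumes a CloudFormation-style JSON template and only checks that a non-empty `Resources` section exists and that every resource declares a `Type`. The function name `lint_template` is made up for illustration; a real project would more likely use the dry-run or validate mode of its IaC tool.

```python
import json


def lint_template(template_text: str) -> list:
    """Return a list of problems found in a CloudFormation-style JSON template."""
    try:
        template = json.loads(template_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(template, dict):
        return ["template must be a JSON object"]

    resources = template.get("Resources")
    if not isinstance(resources, dict) or not resources:
        return ["missing or empty 'Resources' section"]

    # Every resource needs a 'Type' so the provider knows what to create.
    problems = []
    for name, resource in resources.items():
        if not isinstance(resource, dict) or "Type" not in resource:
            problems.append(f"resource '{name}' has no 'Type'")
    return problems
```

In CI, a script could call `lint_template` on each template file and exit non-zero if the returned list is not empty.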

Give an identity to incoming requests and triggered events. Tracking a series of connected events between cloud services is tedious unless each request carries an identity. This request identity should be recorded through the whole lifecycle of the event, at least in a way that lets you backtrack to "what triggered this event" if required.
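One way to carry a request identity through a service is sketched below. The `X-Request-ID` header name is an assumption (a common convention, not something any provider mandates), and `begin_request` / `outgoing_headers` are hypothetical helpers. The idea is to reuse the caller's ID when one arrives, mint a new one otherwise, stamp it on every log record, and pass it on to downstream calls.

```python
import contextvars
import logging
import uuid

# Holds the identity of the request currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def begin_request(headers: dict) -> str:
    """Reuse the caller's ID if present, otherwise mint a new one."""
    request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
    request_id_var.set(request_id)
    return request_id


def outgoing_headers() -> dict:
    """Headers to attach to downstream calls and emitted events."""
    return {"X-Request-ID": request_id_var.get()}
```

With the filter added to a handler, a log format such as `%(request_id)s %(message)s` makes it possible to search for one ID and see everything it triggered.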

Use centralized logging. Say one of your users reports an error. You would like to see the low-level logs of what was happening in the environment at that time, but you may have no way of knowing which of your hundreds of servers caused the error. With all logs collected in one place, you can search across every server at once.

AWS CloudWatch
Azure Monitor
GCP Cloud Logging (formerly Stackdriver Logging)
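Centralized logs are easier to search when every entry is structured and says which host produced it. Below is a minimal sketch using only the Python standard library. The field names are my own choice, and it assumes a log agent on the machine collects the process output and ships it to one of the services above; the shipping itself is not shown.

```python
import json
import logging
import socket


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "host": socket.gethostname(),  # tells you which server logged it
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    handler = logging.StreamHandler()  # defaults to sys.stderr
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Calling `build_logger("api").error("payment failed")` would emit a single JSON line that can be filtered by `host` or `level` once it reaches the central store.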

Monitor application health. How do you know if your application is up and running? All cloud providers have multiple ways to set up health checks (e.g. AWS Route 53 health checks) that alert if something isn't working. Once you have this information, you can make the system self-repairing, but still trigger at least a warning email/message.
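On the application side, a health check usually boils down to an endpoint that runs a few quick probes and reports the result. Here is a rough sketch of that probe logic. `run_health_checks` and the check names in the usage note are hypothetical, and wiring the result to an HTTP endpoint that the provider's health checker polls is left out.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each named check and summarize overall health."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception as exc:
            # A crashing probe counts as unhealthy, not as a crashed endpoint.
            results[name] = f"error: {exc}"

    healthy = all(status == "ok" for status in results.values())
    return {"status": "ok" if healthy else "unhealthy", "checks": results}
```

For example, `run_health_checks({"database": can_query_db, "disk": has_free_space})` would return a dictionary that the endpoint can serve as JSON, answering with a non-2xx status code when `status` is not `"ok"` so the external checker raises its alert.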

Sources