ruk·si

☁️ AWS
S3

Updated at 2016-02-22 20:44

S3 is an object storage service by Amazon where an object is data + metadata.

You can store an unlimited amount of data in S3, but each account can have up to 100 buckets. Bucket identifiers must be globally unique, so it's good to prefix your buckets with your company name or your personal nickname.

Maximum S3 key length is 1024 bytes of UTF-8, just something to keep in mind. The key is essentially the "path" plus the "filename".

S3 provides eventual consistency. If you upload a file and the upload succeeds, the file is safely stored, but downloading the just-updated file might still return the old version for a moment.

S3 files are stored redundantly across multiple facilities by AWS. If a file is uploaded to S3, you don't need to create separate backups against hardware failure; it is safe.

Ensure even key distribution for better performance. Use random characters or reversed timestamps at the start of all keys. The start of the key determines which partition the object is stored in, and storing too many objects in the same partition limits I/O performance. The first three characters are the most important ones.

# bad I/O performance
image1.png
image2.png
logs/service_log.2012-02-27-23.hostname1.mydomain.com
logs/service_log.2012-02-27-23.hostname2.mydomain.com

# good I/O performance
a18-image1.png
ff3-image2.png
c4e/logs/service_log.2012-02-27-23.com.mydomain.hostname1
4lr/logs/service_log.2012-02-27-23.com.mydomain.hostname2

# the last two can still be listed together with a three-character
# prefix pattern like [0-9a-z][0-9a-z][0-9a-z]/logs/
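One way to generate such prefixes is to hash the original key and use the first few characters of the digest; a minimal sketch in Python (the helper name is made up for illustration):

```python
import hashlib

def prefixed_key(key: str, length: int = 3) -> str:
    """Prepend a short hex hash of the key to spread objects across partitions."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:length]}/{key}"

print(prefixed_key("logs/service_log.2012-02-27-23.com.mydomain.hostname1"))
```

A hash keeps the prefix deterministic, so the same logical key always maps to the same partitioned key and can be looked up without a separate index.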

Utilize CloudFront. Use the CloudFront content delivery service in front of S3 if you serve a lot of GETs. It reduces download times.

Utilize Glacier. Use the Glacier archiving service if data is rarely read. For example, move backups from S3 to Glacier after 30 days. It reduces storage costs.

You can make S3 automatically send objects to Glacier with lifecycle rules.

S3 > Select bucket > Properties > Lifecycle > Add rule
Choose whether the rule applies to the whole bucket, a specific folder or a file.
Choose Archive Only and how many days to wait before sending objects to Glacier.
If you select 0 days, the file will be archived within 24 hours.
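The same rule can also be set from the command line. A sketch, assuming a bucket named your-bucket with backups stored under a backups/ prefix (both are placeholders):

```json
{
  "Rules": [
    {
      "ID": "archive-backups",
      "Prefix": "backups/",
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```

```shell
aws s3api put-bucket-lifecycle-configuration \
    --bucket your-bucket \
    --lifecycle-configuration file://path/to/lifecycle.json
```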

Consider using pre-signed URLs for user S3 uploads. You can allow users to upload straight to S3 with pre-signed URLs. http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html

S3 file access is controlled in three ways:

  • IAM policies
  • S3 bucket policies
  • S3 access control lists (ACL)

Final authorization is the union of the IAM policy, the S3 bucket policy and the S3 ACL. Decisions default to deny, and any explicit deny trumps an allow. In other words, if no method specifies deny and at least one specifies allow, the request is allowed; if one method specifies deny and the other two specify allow, the request is denied.
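The evaluation logic described above can be sketched as a small function (the names are made up for illustration):

```python
def authorize(decisions):
    """Combine per-method decisions ('allow', 'deny' or None) the way S3 does:
    an explicit deny always wins, otherwise at least one allow is required."""
    if "deny" in decisions:
        return "deny"
    if "allow" in decisions:
        return "allow"
    return "deny"  # default deny when no method says anything

# One deny trumps two allows:
print(authorize(["allow", "deny", "allow"]))  # deny
# No denies and a single allow is enough:
print(authorize([None, "allow", None]))       # allow
```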

IAM policies should be used when possible. They are the easiest to manage and are shared between multiple users through the IAM infrastructure. But they are not well suited for controlling public read access.

S3 bucket policies allow conditional access.

  • Only serve the file if the request originated from a certain domain (referer).
  • Allow uploads made from another specific AWS account.
  • Grant access to CloudFront Origin Identity.

S3 ACLs allow entity-level access control. Each bucket and object can have its own ACL. The owner of the entity always has full access. For some reason, ACLs are labeled as "Permissions" in the web interface.

private             = no one but the owner can access (default)
public-read         = AllUsers group can read
public-read-write   = AllUsers group can read and write (avoid using)
authenticated-read  = AuthenticatedUsers (any AWS Account) group can read
log-delivery-write  = LogDelivery group can write and read the ACL
aws-exec-read       = EC2 gets read access, used for AMI bundles
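These canned ACLs can be set at upload time or changed on an existing object with the CLI; a sketch assuming a bucket called your-bucket:

```shell
# set the ACL at upload time
aws s3api put-object \
    --bucket your-bucket \
    --key image1.png \
    --body image1.png \
    --acl public-read

# change the ACL of an existing object
aws s3api put-object-acl \
    --bucket your-bucket \
    --key image1.png \
    --acl private
```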

Bucket policy example:

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"1",
      "Effect":"Allow",
      "Principal": "*",
      "Action":["s3:GetObject"],
      "Resource":["arn:aws:s3:::your-bucket/*"]
    }
  ]
}
aws s3api put-bucket-policy \
    --bucket your-bucket \
    --policy file://path/to/policy.json
# or by web interface

S3 can be used to host a static website. It also provides a few extensions beyond just serving static files:

  • You can define custom index and error documents.
  • You can define redirects.
  • You can set a custom domain.
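Website hosting can also be enabled from the command line; a sketch assuming the index and error documents already exist in the bucket:

```json
{
  "IndexDocument": { "Suffix": "index.html" },
  "ErrorDocument": { "Key": "error.html" }
}
```

```shell
aws s3api put-bucket-website \
    --bucket your-bucket \
    --website-configuration file://path/to/website.json
```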

An S3 website can be served from a custom domain configured in Route 53.

  • The CNAME must match the bucket name, e.g. static.example.com
  • It must be a subdomain, not the root domain like example.com

Writing to S3 can be faster than writing to EC2's block storage (EBS). It depends on your use case, so it's worth benchmarking.

Sources

  • Amazon Dev Day Casual Connect, San Francisco, 10th of August, 2015
  • AWS in Action, Michael Wittig and Andreas Wittig