Scalable User Uploads with Amazon S3

Sooner or later your app will probably need to store files uploaded by your users. If you have a small user group uploading small files, you may not care much about your upload architecture. If your app is successful, though, keeping file uploads performant as it scales can become extremely challenging, especially if your users are distributed geographically or need to upload very large files.

The Architecture

So how do you design a scalable uploads architecture? You don’t; you leverage an existing scalable system like Amazon S3. Your architecture looks like this:

The key point in the architecture is that uploads never pass through your service. Web or mobile clients simply get a signed upload URL from your service and upload files directly to that URL. Because the URL is signed, unauthorized clients cannot write arbitrary files into your S3 bucket(s). And because the upload goes straight to S3, your service doesn’t have to deal with receiving, processing or storing the uploads.
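From the client's point of view, that flow could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the `/api/upload-target` endpoint and the `buildUploadForm` and `uploadFile` names are made up for illustration, and it assumes your service responds with the `url` and `fields` of a presigned POST. It relies on the standard `fetch` and `FormData` APIs.

```javascript
// Hypothetical endpoint on your own service that returns { url, fields }.
const UPLOAD_TARGET_ENDPOINT = '/api/upload-target';

// Build the multipart form for S3: every signed field first, the file last.
// S3 ignores form fields that come after the file field.
function buildUploadForm(fields, file) {
  const form = new FormData();
  for (const [name, value] of Object.entries(fields)) {
    form.append(name, value);
  }
  form.append('file', file);
  return form;
}

// fetchImpl is injectable so the flow can be exercised without a network.
async function uploadFile(file, fetchImpl = fetch) {
  // 1. Ask your service for a signed upload target (a small JSON request).
  const targetResponse = await fetchImpl(UPLOAD_TARGET_ENDPOINT, { method: 'POST' });
  if (!targetResponse.ok) {
    throw new Error(`Could not get upload target: ${targetResponse.status}`);
  }
  const { url, fields } = await targetResponse.json();

  // 2. Send the file bytes straight to S3; they never touch your service.
  const uploadResponse = await fetchImpl(url, {
    method: 'POST',
    body: buildUploadForm(fields, file),
  });
  if (!uploadResponse.ok) {
    throw new Error(`Upload failed: ${uploadResponse.status}`);
  }
  return fields.key;
}
```

Only the first request hits your service, and it carries no file data, so its cost stays the same no matter how large the upload is.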

The Code

This is a simplistic example of getting the presigned URL and the other fields a client needs in order to upload to S3. This code would be placed in a server-side route that likely requires the user to be authenticated and authorized to upload. The server-side code must have an IAM credential that is authorized to upload to the target bucket. For workloads running in AWS, an excellent way to provide this credential is to assign an IAM role to the thing (EC2 instance, Lambda function, etc.) that runs this code.

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

const params = {
  Bucket: 'bucket',
  // Fields is an object of form fields to sign, not an array
  Fields: {
    key: 'key'
  }
};

s3.createPresignedPost(params, function(err, data) {
  if (err) {
    console.error('Presigning post data encountered an error', err);
  } else {
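    // data.url contains the upload form action
    // data.fields contains a hash of fields you must include in your upload form
    // Returning from this callback has no effect, so send data to the
    // client from here instead (for example as your route's JSON response)
    console.log(data);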
  }
});
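The example above signs only a key. In practice you would probably also want to limit how long the signature is valid and how large the upload may be. Here is a hedged sketch of that, using the `Expires` and `Conditions` parameters of `createPresignedPost` in the v2 SDK. The `buildPresignParams` and `presignUpload` helpers, the `uploads/<userId>/<fileName>` key layout, and the 10 MB limit are all assumptions for illustration.

```javascript
// Build createPresignedPost params that pin the key and limit the upload.
// The key layout and default size limit are illustrative assumptions.
function buildPresignParams(bucket, userId, fileName, maxBytes = 10 * 1024 * 1024) {
  const key = `uploads/${userId}/${fileName}`;
  return {
    Bucket: bucket,
    Expires: 300, // seconds the signed policy stays valid
    Fields: { key },
    Conditions: [
      // Reject empty files and anything larger than maxBytes
      ['content-length-range', 1, maxBytes],
    ],
  };
}

// Promise wrapper around the callback-style createPresignedPost.
// `s3` is expected to be an AWS.S3 client from aws-sdk v2.
function presignUpload(s3, params) {
  return new Promise((resolve, reject) => {
    s3.createPresignedPost(params, (err, data) => (err ? reject(err) : resolve(data)));
  });
}
```

A route handler could then call `presignUpload(s3, buildPresignParams('bucket', user.id, name))` and return the result as JSON. If the file name comes from the client, you would want to sanitize it before building the key.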

But I need to do stuff with the upload!

The architecture I’ve described is all well and good, you say, but I need to resize images, read metadata from the file, etc. Are we back to having to upload the file to our service? No.

First, let me say that there may be cases of complex upload processing where a Lambda function is insufficient. In the vast majority of cases, however, Lambda can handle your upload processing. If you need to transcode video files, AWS can help with that too, but here we’re going to focus on processing a file after upload using SNS and Lambda.

So I was cheating a bit with the first architecture diagram. Here’s the rest:

After your client has uploaded a file to S3, it can notify your service. Your service in turn pushes an event to an SNS topic. The event contains information like the S3 bucket and key so that your Lambda function can fetch the file and process it. Here’s an example Lambda function that resizes an image and saves it back to S3.

const sharp = require('sharp');
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
  const { bucket, key } = JSON.parse(event.Records[0].Sns.Message);

  // Get the image from S3. Use only .promise() here, not a callback as well;
  // a failure rejects the promise and fails the invocation.
  const imageData = await s3.getObject({
    Bucket: bucket,
    Key: key
  }).promise();

  // Resize the image with the sharp lib; resize() takes (width, height)
  const newHeight = 100;
  const newWidth = 100;
  const resizedImage = await sharp(imageData.Body)
    .resize(newWidth, newHeight)
    .toBuffer();

  // Save the resized image back to the same location
  return await s3.putObject({
    Body: resizedImage,
    Bucket: bucket,
    Key: key,
  }).promise();
};
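The handler above expects the SNS message body to be a JSON string containing `bucket` and `key`. The service-side step that publishes that event might look something like this. It is a sketch under assumptions: the topic ARN is a placeholder, `buildUploadMessage` and `notifyUpload` are hypothetical helper names, and `sns` is assumed to be an `AWS.SNS` client from the same v2 SDK.

```javascript
// Placeholder topic ARN; in practice this would come from configuration.
const UPLOAD_TOPIC_ARN =
  process.env.UPLOAD_TOPIC_ARN || 'arn:aws:sns:us-east-1:123456789012:uploads';

// Build the publish params. The Message must be a string, so the
// bucket and key are serialized as JSON for the Lambda to parse.
function buildUploadMessage(bucket, key) {
  return {
    TopicArn: UPLOAD_TOPIC_ARN,
    Message: JSON.stringify({ bucket, key }),
  };
}

// Publish the event; `sns` is expected to be an AWS.SNS client (aws-sdk v2).
async function notifyUpload(sns, bucket, key) {
  return sns.publish(buildUploadMessage(bucket, key)).promise();
}
```

SNS delivers the published string to Lambda as `event.Records[0].Sns.Message`, which is why the handler calls `JSON.parse` on it.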

Summary

Amazon S3 and Lambda allow all of us to build scalable file uploads. Just make sure you keep things scalable by not routing uploaded files through your own service unnecessarily. Uploads should go directly to S3, and if you need post-processing, use Lambda.

Brandon knows web development, from automated deployments to your favorite cloud hosting provider to building a great user experience in the browser and all the parts between.
