Archive for August 2015

Bulk Updating S3 Files in F# with AWS .NET SDK

I’ve been working with a client to enhance an existing site where users create “playlists” of videos that are burned to DVD and then mailed to them. Now we’re adding the ability to download the files directly, and we’re storing those files on Amazon S3.

After I uploaded the files to my bucket and linked to them from the site, clicking a link to a video file made the browser play the file instead of prompting the user to save it.

The key to getting the save prompt is the Content-Disposition: attachment header in the response to the GET request for the file. Amazon does allow you to set this in the AWS Management Console, but not across multiple files simultaneously. Because we’ve got over 100 videos, I needed to automate it.

Enter the AWS .NET SDK and F#. Most of the code samples you’ll see for the AWS SDK for .NET are C#-based, but it works just as well in F#.

First things first, you’ll need to grab the AWSSDK NuGet package and reference it appropriately. Then we need to open a few namespaces from the SDK.

open Amazon  
open Amazon.S3  
open Amazon.S3.Model  
open Amazon.S3.IO

Next, we’ll need to set up our keys. These can be created from your AWS Management Console.

let accessKey = "YOUR_AWS_ACCESS_KEY"  
let secretKey = "YOUR_AWS_SECRET"
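
If you'd rather not paste keys into the script itself, one option is to read them from environment variables. This is only a sketch: requireEnv and loadKeys are hypothetical helpers, and the variable names are an assumption, so adjust them to wherever you keep your keys.

```fsharp
open System

// Hypothetical helper: read a required setting from the environment,
// failing with a clear message if it is missing or empty.
let requireEnv (name: string) =
    match Environment.GetEnvironmentVariable name with
    | null | "" -> failwithf "Environment variable %s is not set" name
    | value -> value

// Assumed variable names; nothing is read until loadKeys is called.
let loadKeys () =
    requireEnv "AWS_ACCESS_KEY_ID", requireEnv "AWS_SECRET_ACCESS_KEY"
```

With that in place, `let accessKey, secretKey = loadKeys ()` would stand in for the two hard-coded values.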

Then, let’s get our environment set up. I had to set Amazon.AWSConfigs.LoggingConfig.LogTo to LoggingOptions.SystemDiagnostics to prevent the SDK from complaining about log4net not being available.

// Set the AWS SDK to log using built-in .NET logging. 
Amazon.AWSConfigs.LoggingConfig.LogTo <- LoggingOptions.SystemDiagnostics;

// Instantiate a new S3 client
let client = new AmazonS3Client(accessKey, secretKey, RegionEndpoint.USEast1); 

Now that I’m connected, I want to get a list of files in my bucket.

let bucket = "my-bucket-name";  
let files = S3DirectoryInfo(client, bucket).GetFiles(); 

For each of my files, I want the Content-Disposition header set to attachment, which forces the browser to prompt for a download.
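
The header can also carry a suggested filename for the save dialog. I only needed the bare attachment value, but here is a minimal sketch of building the longer form. dispositionFor is a hypothetical helper, not part of the SDK, and it assumes plain ASCII file names.

```fsharp
// Hypothetical helper: build a Content-Disposition value that suggests a filename.
// Assumes ASCII names; backslashes and double quotes are escaped so the
// quoted filename stays well-formed.
let dispositionFor (fileName: string) =
    let escaped = fileName.Replace("\\", "\\\\").Replace("\"", "\\\"")
    sprintf "attachment; filename=\"%s\"" escaped
```

The result could be assigned wherever the header is set, for example `dispositionFor file.Name` in place of the plain "attachment" string.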

Below, I declare a processFile function that takes an AmazonS3Client (our client from above) and a single S3FileInfo object representing one of my files.

One thing worth noting is that S3 doesn’t support the idea of “updating” an object. To change an object’s details, you must make a “copy” of it, and there’s nothing to stop you from copying an object over itself.


let processFile (client: AmazonS3Client, file: IO.S3FileInfo) = 

    // AWS requires you to "copy" a file in order to change details about it. Here we just copy over the original file.
    let copyRequest = new CopyObjectRequest(SourceBucket = bucket, 
                                            DestinationBucket = bucket, 
                                            SourceKey = file.Name, 
                                            DestinationKey = file.Name, 
                                            CannedACL = S3CannedACL.PublicRead);

    // I want the browser to prompt for a file download when a user requests this file.
    copyRequest.Headers.ContentDisposition <- "attachment"

    //
    // Perform any other updates here.
    //

    try
        printf "Processing: %s..." file.Name
        client.CopyObject(copyRequest) |> ignore      // perform the "copy", which updates the file; the response isn't needed.
        printfn "Done!"
    with
        | ex -> printfn "Failed while processing!! %s" ex.Message
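
One caveat: in the S3 API, a copy keeps the source object's metadata unless the request asks for it to be replaced. Whether the SDK requests that automatically when headers are supplied may depend on the SDK version. So if the new header doesn't show up after the copy, setting the directive explicitly is worth a try. A sketch, assuming the same SDK types as above; buildReplaceRequest and its contentType parameter are hypothetical names.

```fsharp
open Amazon.S3
open Amazon.S3.Model

// Sketch: a self-copy request that explicitly replaces the object's metadata.
// With REPLACE, S3 takes all metadata from the request, so the Content-Type
// has to be supplied again (contentType is a hypothetical parameter).
let buildReplaceRequest (bucket: string) (key: string) (contentType: string) =
    let request = new CopyObjectRequest(SourceBucket = bucket,
                                        DestinationBucket = bucket,
                                        SourceKey = key,
                                        DestinationKey = key,
                                        CannedACL = S3CannedACL.PublicRead,
                                        MetadataDirective = S3MetadataDirective.REPLACE)
    request.Headers.ContentDisposition <- "attachment"
    request.Headers.ContentType <- contentType
    request
```

Something like `buildReplaceRequest bucket file.Name "video/mp4"` would then be passed to client.CopyObject in place of copyRequest.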

Finally, we iterate over our S3FileInfo array, sending each file along with our client into our new processFile function to perform the update.

files |> Array.iter(fun file -> processFile(client, file))

You’ll see the output in your console.

Processing: My_First_Video.mp4...Done!  
Processing: My_Second_Video.mp4...Done!  
...
Processing: My_Final_Video.mp4...Done!  

This code updates files one at a time, but it could be made parallel for even faster processing using F#’s Array.Parallel.iter function. I would’ve done this, but YAGNI.
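
For the curious, the parallel version would look roughly like this. The sketch uses a stand-in function and made-up file names so it runs without AWS; processName and processed are hypothetical names, not part of the code above.

```fsharp
open System.Collections.Concurrent

// Stand-in for processFile: it only records the name it was given.
// ConcurrentBag is used because items may be added from several threads.
let processed = ConcurrentBag<string>()
let processName (name: string) = processed.Add name

let names = [| "My_First_Video.mp4"; "My_Second_Video.mp4"; "My_Final_Video.mp4" |]

// Same shape as Array.iter, but elements may be processed concurrently
// and in no particular order.
names |> Array.Parallel.iter processName
```

Against the real bucket, that would be `files |> Array.Parallel.iter (fun file -> processFile(client, file))`. Expect the console output to interleave, since each file's "Processing" and "Done!" messages are printed by separate calls.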
