
Added options for Amazon S3 Multipart Upload

I was looking for a plugin that supports real multipart / chunked uploads. That is a rather specific requirement, because AWS S3 expects its own upload protocol. Then I found Dropzone! 🥳

The implementation was good, but some parts did not fit my use case (nothing is perfect), so I created a branch to work on the issues I had to solve for real AWS S3 Multipart Upload. You can see how it works here: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html

The workflow is simple:

1.) When starting a file upload, create an empty object (simply put)

2.) For each chunk / part, create an upload URL (because clients should not see the client id/secret) and use this URL for the PUT request

3.) Complete the upload once all chunks have been uploaded successfully, and remember the ETag header of every uploaded part
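
These steps correspond to the S3 operations CreateMultipartUpload, UploadPart and CompleteMultipartUpload. As a rough sketch of the bookkeeping behind step 2, the helper below splits a file size into numbered parts and checks them against the S3 limits as I understand the AWS documentation (at most 10,000 parts, 5 MiB to 5 GiB per part, with the last part allowed to be smaller). The name `planParts` is hypothetical; it is not part of Dropzone or the AWS SDK.

```javascript
// S3 multipart limits (assumption: values as documented by AWS at the time of writing).
const MIN_PART_SIZE = 5 * 1024 * 1024;        // 5 MiB, does not apply to the last part
const MAX_PART_SIZE = 5 * 1024 * 1024 * 1024; // 5 GiB
const MAX_PARTS = 10000;

// Hypothetical helper: returns one entry per part with its 1-based PartNumber
// and the byte range [start, end) that could be passed to Blob.slice().
function planParts(fileSize, chunkSize) {
  if (!Number.isInteger(fileSize) || fileSize <= 0) {
    throw new RangeError('fileSize must be a positive integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0 || chunkSize > MAX_PART_SIZE) {
    throw new RangeError('chunkSize must be between 1 byte and 5 GiB');
  }
  const partCount = Math.ceil(fileSize / chunkSize);
  if (partCount > MAX_PARTS) {
    throw new RangeError('too many parts, increase chunkSize');
  }
  // Only the last part may be smaller than 5 MiB, so the minimum
  // matters as soon as there is more than one part.
  if (partCount > 1 && chunkSize < MIN_PART_SIZE) {
    throw new RangeError('chunkSize must be at least 5 MiB for multi-part uploads');
  }
  const parts = [];
  for (let i = 0; i < partCount; i++) {
    const start = i * chunkSize;
    const end = Math.min(start + chunkSize, fileSize);
    parts.push({ PartNumber: i + 1, start, end });
  }
  return parts;
}
```

For a 25,000,000 byte file and the 10 MB chunk size used in the example in this request, `planParts(25000000, 10000000)` yields three parts, the last one covering only the remaining 5,000,000 bytes.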

So what did I do in this request?

  1. Added defaultHeaders option. AWS rejects additional headers such as Accept, Cache-Control or X-Requested-With for CORS reasons (I have not allowed these headers in my CORS configuration).

  2. Added binaryBody option. This is necessary because AWS S3 needs the parts/chunks as raw binary data, not wrapped in form data or anything else. So Dropzone has to skip FormData and send the binary data directly.

  3. Added resetChunkRequest option. How can I read the ETag header of the response when the response object has been set to null? I added this option, but to be honest I did not end up using it, because I also added response and responseHeaders properties to the chunk object, which can be used instead. There may still be use cases for keeping the XHR object on the chunk.
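
To illustrate the third point, here is a minimal sketch of reading the ETag from the new responseHeaders property. It assumes the property holds the raw header string in the form returned by `XMLHttpRequest.getAllResponseHeaders()` (one `name: value` pair per line); the helper name `parseETag` is hypothetical.

```javascript
// Hypothetical helper: extracts the ETag value (without surrounding quotes)
// from a raw header string, or returns null if no ETag header is present.
function parseETag(rawHeaders) {
  const lines = String(rawHeaders || '').split(/\r?\n/);
  for (const line of lines) {
    const sep = line.indexOf(':');
    if (sep === -1) continue; // not a "name: value" line
    // Header names are case-insensitive; browsers usually lowercase them.
    const name = line.slice(0, sep).trim().toLowerCase();
    if (name !== 'etag') continue;
    const value = line.slice(sep + 1).trim();
    // S3 wraps the ETag in double quotes; strip them if present.
    return value.replace(/^"(.*)"$/, '$1');
  }
  return null;
}
```

Returning `null` instead of throwing lets the caller decide how to handle a missing header. In a cross-origin setup the browser only reveals ETag to scripts if the bucket's CORS configuration lists it under ExposeHeaders, so a `null` result is often a CORS issue rather than a failed upload.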

Here is an example of how you can use Dropzone with AWS S3 Multipart Upload (only with this merge request or with my branch).

var partData = {},
    chunkCount = 1,
    myDropzone = new Dropzone("#dropzone", {
    url: function() {
        /** @ToDo: This could be improved when we can implement a wait for a promise and chunk data :-) */
        return partData[chunkCount++];
    },
    autoProcessQueue: false,
    maxFiles: 1,
    maxFilesize: 1024 * 1024, // given in MiB, so this is 1 TiB
    method: 'PUT',
    uploadMultiple: false,
    chunking: true,
    forceChunking: true,
    chunkSize: 1000000 * 10, // 10 MB (for testing; S3 currently allows up to 5 GB per part)
    parallelChunkUploads: false,
    retryChunks: true,
    retryChunksLimit: 3,
    params: (files, xhr, chunk) => {
        if (chunk) {
            return {};
        }
    },
    defaultHeaders: false,
    binaryBody: true
});

myDropzone.on('addedfile', function(file) {
    chunkCount = 1;
    const partCount = Math.ceil(file.size / myDropzone.options.chunkSize);

    /** CreateMultipartUpload */
    // http.post / jQuery.post / AWS.S3.CreateMultipartUpload

    /** Prepare all the chunks which we want to upload */
    const promises = [];
    for (let x = 1; x <= partCount; x++) {
        promises.push(
            // http.post / jQuery.post / AWS.S3.UploadPart <- must return a Promise!
            // new Promise(function(resolve) {
            //  UploadId: 'See step above with Create Multipart Upload and remember upload id'
            //  PartNumber: x
            //  then resolve with the upload URL!
            // }).then(function(uploadUrl) { partData[x] = uploadUrl; })
        );
    }
    Promise.all(promises).then(function() {
        myDropzone.processFile(file);
    });
});

/** When file is successfully uploaded then finalize the upload on API */
myDropzone.on('success', function(file) {
    var data = {
        UploadId: 'See step above with Create Multipart Upload and remember upload id',
        MultipartUpload: {
            Parts: file.upload.chunks.map(chunk => ({
                PartNumber: chunk.index + 1,
                ETag: chunk.responseHeaders.match(/ETag: "([^"]+)"/i)[1]
            }))
        }
    };
    // http.post / jQuery.post / AWS.S3.CompleteMultipartUpload
});
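
The placeholder loop in the addedfile handler could be filled in roughly as follows. This is only a sketch: `requestPartUrl` is a hypothetical callback that asks your own backend for a presigned UploadPart URL for the given part number, and `presignAllParts` is not part of Dropzone.

```javascript
// Hypothetical helper: requests one upload URL per part and returns them
// as an object keyed by the 1-based part number, like partData above.
async function presignAllParts(partCount, requestPartUrl) {
  const partNumbers = Array.from({ length: partCount }, (_, i) => i + 1);
  // All requests are started at once; Promise.all keeps the result order
  // aligned with partNumbers and rejects if any single request fails.
  const urls = await Promise.all(partNumbers.map((n) => requestPartUrl(n)));
  const partData = {};
  partNumbers.forEach((n, i) => {
    partData[n] = urls[i];
  });
  return partData;
}
```

In the handler this would be used as `presignAllParts(partCount, requestPartUrl).then(function(urls) { partData = urls; myDropzone.processFile(file); })`. For files with very many parts it may be better to request the URLs in batches instead of all at once.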

If you have questions, feel free to ask me 👍
