Using Amazon S3 and the S3Express command line you can efficiently upload very large files (hundreds of megabytes or even multiple gigabytes) to an S3 bucket.
The main issues with uploading large files over the Internet are:
- The upload could be involuntarily interrupted by a transient network issue. If that happens, the whole upload could fail and it would need to be restarted from the beginning. If the file is very large, this results in wasted time and bandwidth.
- Because the file is large, the user may want to deliberately pause the upload and resume it at a later stage. In that case too, the whole upload would need to be restarted from the beginning.
- Because the upload is one big file, only one thread at a time can be used to transfer it, which makes the upload quite slow.
All of the above issues are solved using multipart uploads.
By specifying the flag -mul of the put command when uploading files, S3Express will break each file into chunks (by default 5MB each) and upload them separately.
You can instruct S3Express to upload a number of chunks in parallel using the flag -t.
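For example, a multipart upload with parallel threads might look like the following sketch. The file path, bucket name, and thread count are placeholders, and the exact option syntax (e.g. how a value is passed to -t) may differ; check the S3Express documentation for your version:

```shell
rem Upload a large file to a bucket using multipart upload (-mul),
rem with multiple chunks uploaded in parallel (-t).
rem "c:\data\largefile.zip", "mybucket" and the thread count are examples only.
put c:\data\largefile.zip mybucket -mul -t:4
```

With multiple threads, several 5MB chunks are in flight at once, which typically makes much better use of available bandwidth than a single sequential stream.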
If the upload of a single chunk fails, for whatever reason, or if the upload is interrupted, you can simply restart the uncompleted upload and S3Express will resume from the last successful chunk instead of re-uploading the entire file. If you do not want to resume an unfinished multipart upload, you can use the command rmupl to remove the uncompleted upload.
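Discarding an unfinished multipart upload might look like the following sketch. The bucket and file names are placeholders, and the exact arguments accepted by rmupl are an assumption here; consult the S3Express help for the precise syntax:

```shell
rem Remove the uncompleted multipart upload for a file,
rem discarding the chunks already uploaded.
rem "mybucket/largefile.zip" is an example path only.
rmupl mybucket/largefile.zip
```

Removing abandoned multipart uploads is worthwhile because the already-uploaded chunks otherwise continue to occupy storage in the bucket.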
Once all chunks are uploaded, the file is reconstructed at the destination to exactly match the origin file. S3Express will also recalculate and apply the correct MD5 value.
The multipart upload feature in S3Express makes it very convenient to upload very large files to Amazon S3, even over less reliable network connections, using the command line.