NerdVana

We're not anti-social; we're just not user friendly

Deleting very large S3 buckets

Written by Eric Schwimmer

Surprisingly, there is no way to just delete an S3 bucket. You must first delete all of the objects contained within the bucket, and then delete the bucket itself. Normally this is just a minor inconvenience, but when the bucket in question contains hundreds of millions of objects, you run into a serious problem.
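To make the two-step process concrete, here is a minimal Python sketch of "empty the bucket, then delete it". It is not s3wipe and makes no attempt at speed. It assumes `client` behaves like a boto3 S3 client (`get_paginator("list_object_versions")`, `delete_objects`, `delete_bucket`); the function names and the bucket name in the usage comment are made up for illustration.

```python
from itertools import islice

# DeleteObjects accepts at most 1000 keys per request.
BATCH_SIZE = 1000


def iter_object_versions(client, bucket):
    """Yield a {'Key', 'VersionId'} dict for every version and delete marker."""
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        # Either list may be missing from a page, so default to empty.
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            yield {"Key": entry["Key"], "VersionId": entry["VersionId"]}


def chunks(items, size):
    """Group any iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def empty_and_delete_bucket(client, bucket):
    """Delete every object version in `bucket`, then the bucket itself.

    Returns the number of versions and delete markers removed.
    """
    deleted = 0
    for batch in chunks(iter_object_versions(client, bucket), BATCH_SIZE):
        response = client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": batch, "Quiet": True},
        )
        # In quiet mode the response only lists the keys that failed.
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(f"{len(errors)} deletes failed, first: {errors[0]}")
        deleted += len(batch)
    # The bucket can only be removed once it is completely empty.
    client.delete_bucket(Bucket=bucket)
    return deleted


# Hypothetical usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   empty_and_delete_bucket(boto3.client("s3"), "my-doomed-bucket")
```

A single-threaded loop like this still has to issue hundreds of thousands of requests for a bucket of this size, which is exactly where the trouble starts.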

The tools that Amazon provides for deleting files are simply not capable of working at this scale. We had a bucket with 400 million+ objects (including old versions) that we had to delete. First we tried deleting them through the AWS console/web UI, but that just hung our browser indefinitely (and didn't delete any of the files). Next we tried the aws CLI tool, but at the rate it was deleting files, it would have taken six months to finish (during which time we would, of course, be paying for their storage).
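For a sense of what those numbers mean as a delete rate, here is some back-of-the-envelope arithmetic in Python. The 180-day and 30-day figures are my own round approximations of "six months" and "a month", not measurements.

```python
SECONDS_PER_DAY = 86_400
OBJECT_COUNT = 400_000_000  # "400 million+" objects, taken as a round number


def deletes_per_second(objects, days):
    """Average delete rate needed to clear `objects` in `days` days."""
    return objects / (days * SECONDS_PER_DAY)


six_month_rate = deletes_per_second(OBJECT_COUNT, 180)  # roughly 26 per second
one_month_rate = deletes_per_second(OBJECT_COUNT, 30)   # roughly 154 per second

print(f"six months: ~{six_month_rate:.0f} deletes/sec")
print(f"one month:  ~{one_month_rate:.0f} deletes/sec")
```

In other words, a six-month estimate corresponds to only a couple of dozen deletes per second on average.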

There were some other tools out there (written in Ruby, sadly) which looked promising, but which failed to perform. The best of them reduced the overall delete time to just over a month, and didn't support versioned objects (which means that we would have been left to delete the old versions by hand).

So I wrote my own: s3wipe

