s3tup

Commits

List of commits on branch master.
3302af704764392198d464bb11b85b5ccc2f9af0
Update readme to explain nuances of redirects.
heyimalex committed 11 years ago

ca2bfa0dfaf78c41c609491d1523abae32fc5b8b
Make key methods more explit.
heyimalex committed 11 years ago

259253a39c50e97c72558a53c208b76494fa1422
Split Bucket.delete_keys into two methods.
heyimalex committed 11 years ago

0a3634d34bd7cd9ca842c25e8109819f7a665eeb
Split join into concurrent and liner join.
heyimalex committed 11 years ago

42689f632faea43e2c3497f03354392807a92244
Change how Connection.stats is initialized.
heyimalex committed 11 years ago

fc3df8d0a149bbb0449570fca85b31b63e4be754
Improve error handling in join ctx manager.
heyimalex committed 11 years ago

README

The README file for this repository.

s3tup

A Python package that offers configuration management and deployment for Amazon S3 through simple, declarative YAML files.

Why?

Because writing custom scripts for configuring and deploying to S3 through boto was a major pain. Though tools like s3sync exist, they lack robust options for configuration and you often still need some customization or outside scripting to get them to do exactly what you want.

With s3tup, configuration is straightforward. Like many other tools, it uses ETags to upload and delete only the files that need to change, but it also syncs configuration to files you've already uploaded, making your configuration truly declarative.
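To make the ETag idea concrete, here is a minimal sketch of how a change plan could be computed. It is not s3tup's actual code: the function names and the in-memory dicts are assumptions made for illustration. It relies on the fact that, for objects uploaded in a single part without KMS encryption, the S3 ETag is the hex MD5 digest of the content wrapped in double quotes.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest, which matches the ETag of a simple (single-part) upload."""
    return hashlib.md5(data).hexdigest()


def plan_sync(local_files, remote_etags):
    """Decide what to upload and what to delete.

    local_files:  dict of key -> local content (bytes)         (hypothetical input)
    remote_etags: dict of key -> ETag string as returned by S3 (hypothetical input)
    """
    # Upload keys that are missing remotely or whose content hash differs.
    to_upload = sorted(
        key
        for key, data in local_files.items()
        if remote_etags.get(key, "").strip('"') != md5_hex(data)
    )
    # Delete keys that exist remotely but no longer exist locally.
    to_delete = sorted(key for key in remote_etags if key not in local_files)
    return to_upload, to_delete


local = {"index.html": b"<h1>hi</h1>", "new.css": b"body{}"}
remote = {"index.html": '"' + md5_hex(b"<h1>hi</h1>") + '"', "old.js": '"abc"'}
uploads, deletes = plan_sync(local, remote)
print(uploads, deletes)
```

Unchanged files never appear in either list, which is what keeps repeated deployments cheap.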

Installation

Install via pip:

$ pip install s3tup

Install from source:

$ git clone git://github.com/HeyImAlex/s3tup.git
$ cd s3tup
$ python setup.py install

Usage

S3tup can be used as a command line tool or a Python library. Start by writing a config file (the following sets up a simple website):

# config.yml
---
- bucket: example-bucket
  rsync: /path/to/your/website
  key_config:
    - canned_acl: public-read
      reduced_redundancy: true
    - patterns: ['static/*']
      cache_control: 'max-age=32850000'
  website: |
    <WebsiteConfiguration xmlns='http://s3.amazonaws.com/doc/2006-03-01/'>
        <IndexDocument>
            <Suffix>index.html</Suffix>
        </IndexDocument>
        <ErrorDocument>
            <Key>404</Key>
        </ErrorDocument>
    </WebsiteConfiguration>

Set your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env vars and then run:

$ s3tup config.yml

Easy as that. The configuration file can be as simple or robust as you need, and there are a couple examples in the repo to help you out.

With the --rsync option, your deployments will only change what needs to be changed, and with --dryrun you can preview your changes before you actually commit to making them.

Alternatively, you can use s3tup as a library within Python:

from s3tup.connection import Connection
from s3tup.bucket import Bucket

conn = Connection()
b = Bucket(conn, 'test-bucket')
b.canned_acl = 'public-read'
b.sync()

Documentation here is lacking at the moment, but I'm working on it (and the source is a short read).

Config File

The s3tup configuration file is plain YAML. The base is a list of bucket configurations, which are defined below. An example configuration is available in the repo to help you, and I'll try to keep it as up to date as possible. Because s3tup is just a thin wrapper over the S3 REST API, the best way to understand what all of these options actually do is to consult the online documentation for S3.

Note: Setting an option to None and not setting it at all are not the same thing. For many fields None will assert that the configuration option is not set at all.
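The difference between "set to None" and "not set" is the classic sentinel pattern. The sketch below is only an illustration: `UNSET` and `plan_option` are hypothetical names, and it assumes that an omitted option is simply left alone while None asserts that the option must be absent.

```python
# Unique marker meaning "the config file never mentioned this option".
UNSET = object()


def plan_option(name, value=UNSET):
    """Return a description of what a sync would do for one option."""
    if value is UNSET:
        # Option omitted from the config: do not touch it (assumed behavior).
        return "leave {} untouched".format(name)
    if value is None:
        # Option explicitly None: assert that it is not set on the bucket.
        return "remove {}".format(name)
    # Any other value: write it.
    return "set {}".format(name)


print(plan_option("cors"))
print(plan_option("cors", None))
print(plan_option("cors", "<CORSConfiguration>...</CORSConfiguration>"))
```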

Bucket Configuration

The bucket configuration is a dict that contains, predictably, the configuration options for the bucket named by the required field bucket. All other fields are optional.

| field | default | description |
| --- | --- | --- |
| bucket | required | The target bucket name. |
| region | '' | The region that the bucket is in. Valid values: EU, eu-west-1, us-west-1, us-west-2, ap-southeast-1, ap-southeast-2, ap-northeast-1, sa-east-1, empty string (for the US Classic Region). Note that a bucket's region cannot change; s3tup will raise an exception if the bucket already exists and the regions don't match. |
| canned_acl | | The canned acl of the bucket. Valid values: private, public-read, public-read-write, authenticated-read, bucket-owner-read, bucket-owner-full-control. |
| website | | The website configuration of the bucket. Valid values: either a string xml website configuration (see the S3 documentation) or None, which will delete the website configuration for this bucket altogether. |
| acl | | The acl set on this bucket. Valid values: either a string xml acl (see the S3 documentation) or None, which will set the default acl on the bucket. |
| cors | | The cors configuration of the bucket. Valid values: either a string xml cors configuration (see the S3 documentation) or None, which will delete the cors configuration for this bucket altogether. |
| lifecycle | | The lifecycle configuration of the bucket. Valid values: either a string xml lifecycle configuration (see the S3 documentation) or None, which will delete the lifecycle configuration for this bucket altogether. |
| logging | | The logging configuration of the bucket. Valid values: either a string xml logging configuration (see the S3 documentation) or None, which will delete the logging configuration for this bucket altogether. |
| notification | | The notification configuration of the bucket. Valid values: either a string xml notification configuration (see the S3 documentation) or None, which will delete the notification configuration for this bucket altogether. |
| policy | | The policy set on this bucket. Valid values: either a string json policy (see the S3 documentation) or None, which will delete the policy from this bucket altogether. |
| requester_pays | | Boolean value that says whether to enable or disable requester pays. |
| tagging | | The tagging configuration of the bucket. Valid values: either a string xml tagging configuration (see the S3 documentation) or None, which will delete all tags from this bucket. |
| versioning | | Boolean value that says whether to enable or suspend versioning. Note: Once versioning is enabled on a bucket it cannot be disabled, only suspended! Any bucket that has ever had versioning enabled cannot have a lifecycle configuration set! |
| key_config | | Takes a list of key configuration dicts and applies them to all of the applicable keys in the bucket. See section Key Configuration for details. |
| rsync | | Takes either an rsync configuration dict or a list of them and "rsyncs" a folder with the bucket. See section Rsync Configuration for details. |
| redirects | [ ] | Takes a list of [key, redirect location] pairs. On sync, a zero-byte key is created and uploaded for each pair. If a key has already been uploaded, or is about to be uploaded, at that location, an ActionConflict exception will be raised. To redirect a key that must also hold a value, create a key config with a pattern that matches only that key and add the redirect_url field. |
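The conflict rule for redirects can be sketched as follows. This is an illustration under stated assumptions, not s3tup's implementation: the `ActionConflict` class here is a local stand-in for the exception named above, and `check_redirects` and `occupied_keys` are hypothetical names.

```python
class ActionConflict(Exception):
    """Local stand-in for the exception s3tup raises on conflicting actions."""


def check_redirects(redirects, occupied_keys):
    """Validate [key, redirect location] pairs against keys holding real content.

    occupied_keys: set of keys that already exist or are about to be uploaded
                   (hypothetical input for this sketch).
    Returns a dict of key -> redirect location when there are no conflicts.
    """
    for key, location in redirects:
        if key in occupied_keys:
            raise ActionConflict(
                "cannot redirect {!r} to {!r}: key holds content".format(key, location)
            )
    return dict(redirects)


plan = check_redirects([["old-page", "/new-page"]], {"index.html", "new-page"})
print(plan)
```

A redirect key is a zero-byte placeholder, so letting it share a location with real content would silently discard one of the two. Raising early avoids that.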

Key Configuration

The key configuration field allows you to define key configurations that apply to all keys matched by your matcher fields. These configurations are applied in the order that they appear, and conflicting fields will be overwritten by whichever configuration was applied last. The bucket configuration takes a list of key configurations, so you can have as many as you like. Keep in mind that many of these options are not idempotent; if you already have configuration set on an S3 key, s3tup will overwrite it when it syncs.

| field | default | description |
| --- | --- | --- |
| matcher fields | | See section Matcher Fields below. |
| reduced_redundancy | False | Boolean option to use reduced redundancy storage. |
| encrypted | False | Boolean option to use server side encryption. |
| canned_acl | | The canned acl for the key. |
| acl | | String xml acl policy for this key. |
| cache_control | None | String value of the cache-control header. |
| content_disposition | None | String value of the content-disposition header. |
| content_encoding | None | String value of the content-encoding header. S3tup will not guess content encoding. |
| content_language | None | String value of the content-language header. |
| content_type | None | String value of the content-type header. If not explicitly set, s3tup will make a best guess based on the extension. |
| expires | None | String value of the expires header. |
| metadata | { } | Dict of metadata headers to set on the key. |
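The "applied in order, last one wins" rule can be pictured with a small merge function. This is a sketch rather than s3tup's code: `effective_config` is a hypothetical helper, and for brevity it only looks at the patterns matcher field, using the standard library's `fnmatchcase` for unix style matching.

```python
from fnmatch import fnmatchcase


def effective_config(key, key_configs):
    """Merge every key config that matches `key`, in order of appearance."""
    merged = {}
    for cfg in key_configs:
        patterns = cfg.get("patterns")
        # A config without patterns applies to every key.
        if patterns is None or any(fnmatchcase(key, p) for p in patterns):
            # Later configs overwrite conflicting fields from earlier ones.
            merged.update({k: v for k, v in cfg.items() if k != "patterns"})
    return merged


key_configs = [
    {"canned_acl": "public-read", "cache_control": "max-age=60"},
    {"patterns": ["static/*"], "cache_control": "max-age=32850000"},
]
print(effective_config("index.html", key_configs))
print(effective_config("static/app.js", key_configs))
```

Here `static/app.js` keeps the `canned_acl` from the first config but takes its `cache_control` from the second, because the second was applied last.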

Rsync Configuration

The rsync field allows you to "rsync" a local folder with an S3 bucket. All keys that are uploaded are configured by any present key configurations. Remember that the rsync configuration definition contains the matcher fields, and any local paths (relative to the synced directory) not matched will not be rsynced. This is helpful for ignoring certain files or folders during rsync (and basically emulates the include/exclude/rinclude/rexclude options of s3cmd's sync). The matching process is run on the local pathname relative to src.

| field | default | description |
| --- | --- | --- |
| matcher fields | | See section Matcher Fields below. |
| src | required | Relative or absolute path to folder to rsync. Trailing slash is not important. |
| dest | '' | Optional, allows you to rsync with a specific folder on S3. |
| delete | False | Option to delete keys present in the bucket that are not present locally. Other rsyncs and redirects will override this if there are conflicts. |
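How src and dest relate can be sketched by mapping each local file to the key it would land on. The helper below is a hypothetical illustration, not s3tup's code; it assumes keys are the path relative to src, joined with forward slashes and prefixed by dest when one is given.

```python
import os
import posixpath
import tempfile


def local_to_keys(src, dest=""):
    """Map each file's path relative to `src` to its destination key."""
    mapping = {}
    for root, _dirs, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            # The relative path is what the matcher fields would be run against.
            rel = os.path.relpath(path, src).replace(os.sep, "/")
            mapping[rel] = posixpath.join(dest, rel) if dest else rel
    return mapping


# Build a throwaway folder so the example is self-contained.
with tempfile.TemporaryDirectory() as src:
    os.makedirs(os.path.join(src, "css"))
    for rel in ("index.html", os.path.join("css", "site.css")):
        with open(os.path.join(src, rel), "w") as fh:
            fh.write("x")
    # A trailing separator on src makes no difference to the result.
    keys = local_to_keys(src + os.sep, dest="blog")

print(keys)
```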

Matcher Fields

Both the key and rsync configuration definitions contain these optional fields to constrain which keys they act upon. These are intended to function as intuitively as possible, but in the name of explicitness:

If none of these fields are present, all keys are matched. If neither patterns nor regexes are present, all keys except those matched by ignore_patterns and ignore_regexes are matched. If either patterns or regexes are present, only keys that patterns or regexes match and are not matched by either ignore_patterns or ignore_regexes are matched. Whew.

Remember to always pass a list in!

| field | default | description |
| --- | --- | --- |
| patterns | None | List of unix style patterns to include |
| ignore_patterns | None | List of unix style patterns to exclude |
| regexes | None | List of regex patterns to include |
| ignore_regexes | None | List of regex patterns to exclude |
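The matching rules above fit in a few lines of Python. This is a sketch of the described behavior, not the library's implementation: `matches` is a hypothetical name, unix style patterns are evaluated with `fnmatchcase`, and regexes with `re.search` (whether s3tup anchors its regexes is not stated here, so that choice is an assumption).

```python
import re
from fnmatch import fnmatchcase


def matches(key, patterns=None, ignore_patterns=None, regexes=None, ignore_regexes=None):
    """Return True if `key` is selected by the given matcher fields."""

    def hit(globs, rxs):
        # True if any unix style pattern or any regex matches the key.
        return any(fnmatchcase(key, g) for g in globs or []) or any(
            re.search(r, key) for r in rxs or []
        )

    # Ignore fields always win.
    if hit(ignore_patterns, ignore_regexes):
        return False
    # With no include fields at all, everything not ignored is matched.
    if patterns is None and regexes is None:
        return True
    # Otherwise the key must be matched by at least one include field.
    return hit(patterns, regexes)


print(matches("static/app.js", patterns=["static/*"]))
print(matches("static/app.js.map", patterns=["static/*"], ignore_regexes=[r"\.map$"]))
```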

CLI

positional arguments:

  • config - relative or absolute path to the config file

optional arguments:

  • -h, --help - show this help message and exit
  • --dryrun - show what will happen when s3tup runs without actually running s3tup
  • --rsync - only upload and delete modified and removed keys. no key syncing, no redirecting, no bucket configuring.
  • -c <concurrency> - the number of concurrent requests you'd like to make. anything below one runs linearly. defaults to 5.
  • -v, --verbose - increase output verbosity
  • -q, --quiet - silence all output
  • --access_key_id <access_key_id> - your aws access key id
  • --secret_access_key <secret_access_key> - your aws secret access key
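For readers who want to script around the tool, the options listed above could be declared with argparse roughly like this. It is a sketch that mirrors the documented flags, not the parser s3tup actually ships; the `dest` names and help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command line options."""
    parser = argparse.ArgumentParser(prog="s3tup")
    parser.add_argument("config", help="relative or absolute path to the config file")
    parser.add_argument("--dryrun", action="store_true",
                        help="show what would happen without changing anything")
    parser.add_argument("--rsync", action="store_true",
                        help="only upload and delete modified and removed keys")
    parser.add_argument("-c", dest="concurrency", type=int, default=5,
                        metavar="<concurrency>",
                        help="number of concurrent requests; below one runs linearly")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="increase output verbosity")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="silence all output")
    parser.add_argument("--access_key_id", metavar="<access_key_id>")
    parser.add_argument("--secret_access_key", metavar="<secret_access_key>")
    return parser


args = build_parser().parse_args(["config.yml", "--dryrun", "-c", "1"])
print(args.config, args.dryrun, args.rsync, args.concurrency)
```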

TODO

This project is in early development and still needs plenty of work before I can confidently say that it's production ready. However, it's slowly getting there.

  • Need to gracefully handle sync of objects > 5GB
  • Larger test suite
  • Implement mfa delete
  • Better support for versioning