dynochemy: Clever pythonic and async interface to Amazon DynamoDB

Yet another Python interface for Amazon DynamoDB. The API is somewhat inspired by SQLAlchemy.

Features

  • Synchronous and Async Support (using Tornado)
  • Full API abstraction (rather than needing to know the internals of how DynamoDB needs its JSON formatted)
  • High-level constructs for dealing with provisioning throughput limits and maintaining secondary indexes.
  • SQL-backend for testing

Status

Under heavy development, just barely functioning. I would stay away unless you're experimenting or really want to get involved with development yourself.

Really just useful for seeing how the API would work if this library were completed.

Known TODOs

  • Solvent: Needs scan support
  • Views: Need the ability to regenerate views from scratch, or even partial rewrites to help with failure scenarios.
  • Solvent: Need throttling support rather than always hammering until a provisioning error.

Example Use

Describe your tables:

class MyTable(dynochemy.Table):
    name = 'my_table'
    hash_key = 'user'
    range_key = 'time'

    read_capacity = 1000
    write_capacity = 50

Connect to your database:

db = dynochemy.DB(ACCESS_KEY, ACCESS_SECRET)
db.register(MyTable)

This uses a table with both hash and range keys (user and time).

import time

create_time = time.time()

db.MyTable.put({'user': '123', 'time': create_time, 'full_name': 'Rhett Garber'})

print db.MyTable.get(('123', create_time))

This performs synchronous operations against a dictionary-like database. But wait, there is more:

d1 = db.MyTable.put_defer({'user': '123', 'time': create_time, 'full_name': 'Rhett Garber'})
d2 = db.MyTable.put_defer({'user': '124', 'time': create_time, 'full_name': 'Rhettly Garber'})
d3 = db.MyTable.put_defer({'user': '125', 'time': create_time, 'full_name': 'Rhettford Garber'})
defer.wait_all([d1, d2, d3])

This does 3 puts simultaneously. If you're within an existing Tornado environment, you can do something like:

from tornado.ioloop import IOLoop

db = dynochemy.DB(ACCESS_KEY, ACCESS_SECRET, ioloop=IOLoop.instance())

user = yield tornado.gen.Task(db.MyTable.get_async, ('123', create_time))

(If you're not familiar with the brilliant tornado.gen stuff, you should be.)
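
For context, here is a minimal sketch of what that looks like inside a Tornado coroutine; the handler class and field names are illustrative, not part of dynochemy:

import tornado.gen
import tornado.web

class UserHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine
    def get(self):
        # gen.Task suspends this method until get_async invokes its callback.
        user = yield tornado.gen.Task(db.MyTable.get_async, ('123', create_time))
        self.write(user['full_name'])
        self.finish()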

Querying has a nice API as well:

result = db.MyTable.query('123').reverse().limit(2)()
for item in result:
    print item['full_name']

And of course you can do this async as well:

result = yield tornado.gen.Task(db.MyTable.query('123').range(t0, t1).async())

One of the great features of DynamoDB is how it can do atomic counters using the 'update' command. Dynochemy can support that too:

db.MyTable.update((hash_key, range_key), add={'counter_1': 1, 'counter_2': 1}, put={'time_modified': time.time()})

This would update the indicated item (or create it if it doesn't exist), increment (or create) the counters, and then set the 'time_modified' field.
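
For example, two increments of the same counter accumulate atomically (a sketch reusing the keys from the earlier examples; the 'visits' attribute is just illustrative):

db.MyTable.update(('123', create_time), add={'visits': 1})
db.MyTable.update(('123', create_time), add={'visits': 1})

# Each update incremented the counter in place.
print db.MyTable.get(('123', create_time))['visits']  # prints 2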

High-level Dynochemy: Solvent

The above sample API gives you simple, direct access to the database. However, in production use, I've found that dealing with errors, multiple requests and the like really cries out for a higher-level abstraction.

So, we have what are called 'Solvents'.

A Solvent is an abstraction that collects a list of operations to be completed. It figures out how to combine those operations into individual requests, runs them against a database, and deals with throttling for you.

Example:

s = dynochemy.Solvent()

s.put(MyTable, entity_1)
s.put(MyTable, entity_2)
s.put(MyTable, entity_3)

get_op_1 = s.get(MyTable, '123')
get_op_2 = s.get(MyTable, '124')

result = s.run(db)

entity = result[get_op_1]
print entity

This example will simultaneously run a BatchWriteItem and a BatchGetItem request. If part of the batch fails because of too many writes, it will be transparently retried a few times, after a delay. Each call that adds an operation returns an operation object that can be used as a key to retrieve the resulting values and errors.
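
For instance, you can fan out several reads and collect each result by its operation handle (a sketch built only from the calls shown above):

s = dynochemy.Solvent()
get_ops = [s.get(MyTable, user_id) for user_id in ('123', '124', '125')]

result = s.run(db)
for op in get_ops:
    # Each operation object indexes its own entry in the combined result.
    print result[op]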

Views

In addition to intelligently combining operations together for better performance, a solvent can maintain 'views'.

A view is another table full of calculated or processed data based on data in the original table. For example, if you wanted to count how many entities in your table had a certain value, you could create a view:

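# EntityTable is assumed to be an existing, registered Table whose
# items carry a 'value' attribute.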
class CountTable(Table):
    name = 'count_table'
    hash_key = 'value'

class EntityValueCountView(View):
    table = EntityTable
    view_table = CountTable

    @classmethod
    def add(cls, entity):
        return [UpdateOperation(cls.view_table, entity['value'], add={'count': 1})]

    @classmethod
    def remove(cls, entity):
        return [UpdateOperation(cls.view_table, entity['value'], add={'count': -1})]

db.register(CountTable)
db.register(EntityValueCountView)

After registering the view, any writes to 'EntityTable' through a solvent will also be run through your view implementation, allowing the secondary table to be kept in perfect sync.
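
For example, writing an entity through a solvent updates the count table as a side effect (a sketch; the entity contents and the final get are illustrative):

s = dynochemy.Solvent()
s.put(EntityTable, {'id': 'abc', 'value': 'blue'})
s.run(db)

# The view incremented the counter for 'blue' in count_table.
print db.CountTable.get('blue')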

SQL-Backed Dynochemy

Primarily useful for testing: you can point Dynochemy at a SQLAlchemy-compatible database and use the same API.

import sqlalchemy

engine = sqlalchemy.create_engine('sqlite:///')
db = dynochemy.SQLDB(engine)

This SQL backend is pretty fully functional, right down to simulating provisioning throughput calculations.
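
For example, a test can exercise the same calls as the production code (a sketch assuming the tables from the earlier examples):

db.register(MyTable)

db.MyTable.put({'user': '123', 'time': 0, 'full_name': 'Test User'})
print db.MyTable.get(('123', 0))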

Keep in mind that async operations are synchronous behind the scenes, as there are no good async SQL APIs for this.