The Most Important Page In All App Engine Land
http://code.google.com/appengine/docs/python/datastore/functions.html
For my first post I'll talk about what I think is a very useful docs page that is sort of buried in the Python App Engine docs. It's a page that describes all of the functions available in the google.appengine.ext.db package. These are extremely useful functions that I find most App Engine beginners never discover or never use. I'm only going to talk about some of the functions, but all of them can be really useful. I'm not going to talk about run_in_transaction() since that function is covered in the Transactions section of the App Engine docs.
Functions like delete(), delete_async(), get(), get_async(), put(), and put_async() can really help reduce latency in your app. They let you read, write, and delete entities in bulk, which speeds up your datastore work by sending one request (RPC) to the datastore instead of many.
Typical case:
    employees_trained = Employee.gql("WHERE account IN :1", training_registration_list)
    for e in employees_trained:
        e.new_hire_training_completed = True
        e.put()
Now, instead of doing an individual put() for each employee, we can build a list of entities and put them all at once later:
    employees_trained = Employee.gql("WHERE account IN :1", training_registration_list)
    to_save = []
    for e in employees_trained:
        e.new_hire_training_completed = True
        to_save.append(e)
    db.put(to_save)
If the number of employees to modify is high and you're saving them individually, it can really slow down your page loads. Saving them in bulk removes a lot of the overhead of many round trips to the datastore. You can even save entities of different model types in the same call. Just append every model instance that needs updating to the to_save list and call db.put() once.
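If the list gets very large, you may prefer a handful of capped batches over one enormous call. Here's a minimal sketch of that idea. The names chunked and put_in_batches are made up for this post, the batch size of 500 is an arbitrary assumption rather than a documented limit, and the put argument stands in for db.put so the helper can be tried without the SDK:

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_in_batches(entities, put, batch_size=500):
    """Call put() once per batch and return the number of calls made.

    `put` is any callable that accepts a list of entities; in an App Engine
    app that would be db.put. batch_size=500 is an assumed, arbitrary cap.
    """
    calls = 0
    for batch in chunked(entities, batch_size):
        put(batch)  # one call per batch instead of one per entity
        calls += 1
    return calls
```

In a request handler this would be called as put_in_batches(to_save, db.put).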
To save even more time you can use the async functions. They start the datastore read or write and return right away, so the code that follows keeps running while the datastore call is in flight. This is key when you're doing a lot of processing in one page request.
Quick example:
    employees_trained = Employee.gql("WHERE account IN :1", training_registration_list)
    to_save = []
    for e in employees_trained:
        e.new_hire_training_completed = True
        to_save.append(e)
    employee_async = db.put_async(to_save)
    ...
    # Do a bunch of stuff.
    ...
    # Now check for async result at end of code to guarantee it was written.
    employee_async.check_success()
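The shape of that pattern (start the write, do other work, then confirm) can be wrapped in a small helper. This is only a sketch: save_while_working is a made-up name, start_write stands in for db.put_async, and the only thing assumed about the object it returns is the check_success() method used above.

```python
def save_while_working(entities, start_write, work):
    """Start an async write, run `work`, then confirm the write.

    start_write: callable taking a list of entities and returning an object
                 with check_success() (db.put_async in an App Engine app).
    work:        zero-argument callable holding the rest of the request logic.
    """
    rpc = start_write(entities)  # kick off the write without waiting for it
    try:
        result = work()          # runs while the write is in flight
    finally:
        rpc.check_success()      # confirm the write even if work() raised
    return result
```

The try/finally is a design choice: it makes sure the write is always checked, so a failed save can't slip by just because the other work bailed out early.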
Next up, model_to_protobuf() and model_from_protobuf() can be really useful when you need to serialize an entity to save it in memcache or in some other data store. We use them specifically to store model instances in memcache.
Here's an example of how to use these methods to set and get model instances from memcache for an Employee model instance:
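    from google.appengine.api import memcache
    from google.appengine.datastore import entity_pb
    from google.appengine.ext import db

    employee = None
    # Try memcache first; the cached value is the encoded protobuf.
    pb_employee = memcache.get(memcache_key)
    if pb_employee:
        employee = db.model_from_protobuf(entity_pb.EntityProto(pb_employee))
    if not employee:
        # Cache miss: load from the datastore and cache the encoded entity.
        employee = Employee.get('<key>')
        memcache.set(memcache_key, db.model_to_protobuf(employee).Encode())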
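The same read-through pattern works for any model. Here's a minimal sketch where every name (get_cached, cache, load, encode, decode) is hypothetical: cache is anything with memcache-style get(key) and set(key, value), and encode/decode stand in for the protobuf round trip shown above.

```python
def get_cached(key, cache, load, encode, decode):
    """Return the value for `key`, trying `cache` before falling back to `load`.

    cache:  object with get(key) and set(key, value), e.g. the memcache module.
    load:   zero-argument callable that fetches the value from the datastore.
    encode: turns a loaded value into something cacheable.
    decode: turns a cached value back into the original form.
    """
    raw = cache.get(key)
    if raw is not None:
        return decode(raw)  # cache hit: no datastore call
    value = load()
    if value is not None:
        cache.set(key, encode(value))  # only cache real results
    return value
```

Unlike the snippet above, this version skips the cache write when the load comes back empty, so a missing entity never gets encoded.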
Finally, allocate_ids(), allocate_ids_async(), and allocate_id_range() are useful when you need to know what the datastore IDs of entities will be before you create them, or when you want to avoid ID conflicts while transferring data across datastores. Bill Katz gives a good explanation of how those functions are useful for datastore backup and restore (link).
That's it! You should always check back on that page. Google keeps adding more and more useful functions to the db package.
For my next App Engine post I'll talk about how to use custom models to handle large OR-style queries, such as looking up in bulk every entity that matches any value in a list of different values.











