SQLAlchemy

17 Notes
+ Query Examples (Oct. 27, 2015, 2:24 p.m.)

from sqlalchemy import or_ session = get_session() cdr_records = session.query(CDR) print('Total records: %s' % len(cdr_records.all())) --------------------------------------------------------------------------------------------- cdr_records = cdr_records.filter(CDR.calldate.between(data['from_call_date'], data['to_call_date'])) print('Dates: %s' % len(cdr_records.all())) ------------------------------------------------------------------------------------------- cdr_records = cdr_records.filter(or_(CDR.dst.like('%%%s%%' % prefixes[0]), CDR.dst.like('%%%s%%' % prefixes[1]))) --------------------------------------------------------------------------------------------- cdr_records = cdr_records.filter(CDR.disposition.in_(request.POST.getlist('disposition'))) ---------------------------------------------------------------------------------------------

+ filter vs. filter_by (Dec. 19, 2014, 4:34 p.m.)

filter_by uses keyword arguments, whereas filter allows pythonic filtering arguments like filter(User.name=="john") saved_category = session.query(GrammarCategory).filter_by(title=EncodeAES(DB_HASH, category)).first() From a blog: We actually had these merged together originally, i.e. there was a "filter"-like method that accepted *args and **kwargs, where you could pass a SQL expression or keyword arguments (or both). I actually find that a lot more convenient, but people were always confused by it, since they're usually still getting over the difference between column == expression and keyword = expression. So we split them up. filter_by is used for simple queries on the column names like db.users.filter_by(name='Joe') The same can be accomplished with filter by writing db.users.filter(db.users.name=='Joe') but you can also write more powerful queries containing expressions like db.users.filter(or_(db.users.name=='Ryan', db.users.country=='England'))

+ Field Types (Dec. 7, 2014, 1:36 a.m.)

from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func from sqlalchemy.orm import relationship, backref from sqlalchemy.ext.declarative import declarative_base Base = declarative_base() class Department(Base): __tablename__ = 'department' id = Column(Integer, primary_key=True) name = Column(String) employees = relationship( 'Employee', secondary='department_employee_link' ) class Employee(Base): __tablename__ = 'employee' id = Column(Integer, primary_key=True) name = Column(String) hired_on = Column( DateTime, default=func.now()) departments = relationship( Department, secondary='department_employee_link' ) class DepartmentEmployeeLink(Base): __tablename__ = 'department_employee_link' department_id = Column(Integer, ForeignKey('department.id'), primary_key=True) employee_id = Column(Integer, ForeignKey('employee.id'), primary_key=True) --------------------------------------------------------------------------------------------- class Department(Base): __tablename__ = 'department' id = Column(Integer, primary_key=True) name = Column(String) class Employee(Base): __tablename__ = 'employee' id = Column(Integer, primary_key=True) name = Column(String) # Use default=func.now() to set the default hiring time # of an Employee to be the current time when an # Employee record was created hired_on = Column(DateTime, default=func.now()) department_id = Column(Integer, ForeignKey('department.id')) # Use cascade='delete,all' to propagate the deletion of a Department onto its Employees department = relationship( Department, backref=backref('employees', uselist=True, cascade='delete,all')) ---------------------------------------------------------------------------------------------

+ Good Tutorial Websites (Oct. 10, 2014, 2:39 a.m.)

http://www.rmunn.com/sqlalchemy-tutorial/tutorial.html

+ Joins (Oct. 10, 2014, 1:46 a.m.)

Now this is the part where people get constantly confused with SQLAlchemy but fear not, I have you covered. Django hides the business of joins from you. For instance if you want to get all posts written by a specific author that is known by name you would do something like this: posts = Post.objects.filter(author__name__exact=the_author_name) So how do you do that in SQLAlchemy? The answer is that this means a join is taking place. There are two ways to model that select. First the simple one: posts = Post.query.join(Author).filter(Author.name == the_author_name) That wasn't too tricky. How does SQLAlchemy know how to do the right thing? It looks at what joins are possible and if only one is, it selects the right one. Alternatively you can explicitly provide what to join on as an expression as second argument to join(). Again, you can get arbitrarily complex there. Everything after the join automatically operates on the last .join()-ed model. If you want to further filter the former model (here Post) you can either move them before the .join() call or use .reset_joinpoint(). Alternatively you could also express this as a subselect: author_query = Author.query.filter(Author.name == the_author_name) posts = Post.query.filter(Post.author_id.in_(author_query)) Why does SQLAlchemy not do what Django does? Well, first of all explicit is better than implicit: you know exactly what happens. A regular join is not always what you want or SQL would not provide an outerjoin which of course you can use with SQLAlchemy as well. Secondly, it's really easy to replicated. If you are curious of how that works you can have a look at this subclass of the builtin query that implements Django's filtering with keyword arguments: sqlalchemy-django-query: https://github.com/mitsuhiko/sqlalchemy-django-query/

+ Aggregates (Oct. 10, 2014, 1:45 a.m.)

Aggregates in Django are a quite new feature and generally not all that awesome, so we're skipping the Django part here. Thankfully they are much better supported in SQLAlchemy as SQLAlchemy just handles them by querying over arbitrary expressions. Functions on the database can be expressed by sqlalchemy.func.functionname in SQLAlchemy. This in combination with arbitrary expressions makes it quite potent. But first the simple case: from sqlalchemy.sql import func q = session.query(func.count(User.id)) Now that query obviously does not resolve to a model but a scalar value. In this case if we would call q.first() we would get a single tuple back with a single item: the count. For this case SQLAlchemy provides a nice shortcut: scalar(): >>> session.query(func.count(User.id)).scalar() 1337 What if we want to group by something? Use group_by() and just iterate over it: for age, count in session.query(User.age, func.count(User.id)).group_by(User.age).all(): print 'Users aged %d: %d' % (age, count) Distinct counts are simple as well, just call .distinct() on the query. In fact: if you have a rough idea of what the SQL would look like you can get to the expected result with pure guesswork and SQLAlchemy will most likely “just work” ™.

+ Sorting (Oct. 10, 2014, 1:45 a.m.)

In Django if you sort something you do that by calling order_by() and passing it some strings with the columns to order by: forwards = MyModel.objects.order_by('pub_date') backwards = MyModel.objects.order_by('-pub_date') While it appears that the same is possible in SQLAlchemy you have to be careful because it only works as SQLAlchemy inserts that text directly into the query. What instead you want to be doing is using the expressions again: forwards = MyModel.query.order_by(MyModel.pub_date) backwards = MyModel.query.order_by(MyModel.pub_date.desc()) And again, any expression works in that situation, so you can just easily order by ridiculous expressions if you want.

+ Date Based Queries (Oct. 10, 2014, 1:45 a.m.)

In Django you can use field__year=2011 to select all entries where the year of a field has a specific value. Underneath what usually happens is that an EXTRACT expression is issued. Unfortunately that's hugely database dependent and does not map nicely to a function. Thankfully SQLAlchemy provides a helper for that which automatically does the right thing for each database: from sqlalchemy.sql import extract entries_a_month = Entry.query.filter( (extract(Entry.pub_date, 'year') == 2011) & (extract(Entry.pub_date, 'month') == 1) ).all() Quite a few extractions are possible. The most common ones are month, day, year, hour, minute, second, doy (day of year) and dow (day of week).

+ General Query Syntax (Oct. 10, 2014, 1:44 a.m.)

If you want to filter a query in Django you generally use keyword arguments in the format column__operation=value. For instance column__contains='e' to check if a string column named column contains the letter “e”. In SQLAlchemy instead you are using expressions. These expressions can be printed to see what query they would generate. Here some examples: >>> print MyModel.id == 23 model.model_id = :model_id_1 >>> print MyModel.id.in_([1, 2, 3]) model.model_id IN (:model_id_1, :model_id_2, :model_id_3) >>> print MyModel.name.contains('e') model.name LIKE '%%' || :name_1 || '%%' Note that SQLAlchemy shows you the placeholders there because it will let the database insert those values later. The whole expression language expresses pretty much everything that SQL has to offer: >>> print MyModel.thread_count + MyModel.post_count + 1 (model.thread_count + model.post_count) + :param_1 >>> print MyModel.id.between(1, 10) & MyModel.name.startswith('a') model.model_id BETWEEN :model_id_1 AND :model_id_2 AND model.name LIKE :name_1 || '%%' Now this is a biggie, because this is how you can filter for anything if you pass such an expression to filter(): active_users_with_a_or_b = User.query.filter( (User.name.startswith('a') | User.name.startswith('b')) & (User.is_active == True) ).all() To evaluate a query you have a few choices: first() returns the first result from the query and will also tell the database to perform an implicit LIMIT 1. If more than one result is found you won't know and if none is found you get None back. one() is similar to first() but it will not limit the result in any way but perform a sanity check on getting the results. It will raise an NoResultFound exception back if it did not found a single row or a MultipleResultsFound exception if it got more than one result which indicates a bug on your part. all() just evaluates the whole query and returns each row as a list. Why as a list and not as an iterator? First of all because each object returned is also immediately registered on the session. There are of course ways to bypass that, but unless you have an enormous result count you won't notice, secondly because most Python database adapters don't support streaming results anyways. Now this is nice and everything, but all that model repetition can be annoying. For as long as you are just comparing a column to a given value you can use the filter_by() function and pass keyword arguments: user = User.query.filter_by(username=username).first() Multiple arguments are automatically joined with AND.

+ Primary Key Queries (Oct. 10, 2014, 1:43 a.m.)

Queries is where Django and SQLAlchemy are the most different. Django uses keyword arguments to the query functions to filter the query, SQLAlchemy generally uses expressions composed out operator objects. Query by primary key in Django: obj = MyModel.objects.get(pk=the_id) And in SQLAlchemy: obj = MyModel.query.get(the_id) Note that get() returns None if the primary key does not exist in SQLAlchemy and will raise a DoesNotExist exception in Django. Generally the get() method is a shortcut in SQLAlchemy that will also not issue a query for that object if it was already queried for that session before. Also unlike Django your primary key can be of any type or be a compound of more than one column.

+ Updating Entries (Oct. 10, 2014, 1:42 a.m.)

obj = MyModel.query.get(the_id) obj.name = 'New Value' session.commit() OR session.query(Grammar).filter_by(id=1).update({'content': 'New Content'}) session.commit()

+ Deleting Entries (Oct. 10, 2014, 1:42 a.m.)

Deleting works very much like saving in Django. You get your object and then call the delete() method on it: obj = MyModel.objects.filter(pk=the_id).get() obj.delete() In SQLAlchemy that operation is performed via the session: obj = MyModel.query.get(the_id) session.delete(obj) Again, remember to commit your session.

+ Inserting Entries (Oct. 10, 2014, 1:40 a.m.)

Inserting entries in Django can be done with either creating an instance of a model or by using the create() method of the object manager: foo = MyModel(field1='value', field2='value') foo.save() # or alternatively foo = MyModel.objects.create(field1='value', field2='value') In SQLAlchemy you need to do this instead: foo = MyModel(field1='value', field2='value') session.add(foo) But with that you have only added the object to the session, at that point it has not yet committed the transaction. This has to be done explicitly by yourself when you are happy with all the changes: session.commit()

+ Many-To-Many (Oct. 10, 2014, 1:40 a.m.)

Many to many relationships in Django are easy cake because everything is done for you: class Topping(models.Model): name = models.CharField(max_length=30) class Pizza(models.Model): toppings = models.ManyToManyField(Topping) name = models.CharField(max_length=30) In SQLAlchemy we have to construct a helper table to join over: from sqlalchemy import Column, Integer, String, ForeignKey, Table from sqlalchemy.orm import relationship, backref pizza_toppings = Table('pizza_toppings', Base.metadata, Column('topping_id', Integer, ForeignKey('toppings.id')), Column('pizza_id', Integer, ForeignKey('pizzas.id')) ) class Topping(Base): __tablename__ = 'toppings' id = Column(Integer, primary_key=True) name = Column(String(30)) class Pizza(models.Model): __tablename__ = 'pizzas' id = Column(Integer, primary_key=True) name = Column(String(30)) toppings = relationship('Topping', secondary=pizza_toppings, backref=backref('pizzas', lazy='dynamic'))

+ Many-to-One Relationships (Oct. 10, 2014, 1:38 a.m.)

In Django this is straightforward: class Manufacturer(models.Model): name = models.CharField(max_length=30) class Car(models.Model): manufacturer = models.ForeignKey(Manufacturer, related_name='cars') name = models.CharField(max_length=30) In SQLAlchemy we have to be a little bit more expressive: from sqlalchemy import Column, Integer, String, ForeignKey from sqlalchemy.orm import relationship, backref class Manufacturer(Base): __tablename__ = 'manufacturers' id = Column(Integer, primary_key=True) name = Column(String(30)) class Car(models.Model): __tablename__ = 'cars' id = Column(Integer, primary_key=True) manufacturer_id = Column(Integer, ForeignKey('manufacturers.id')) name = Column(String(30)) manufacturer = relationship('Manufacturer', backref= backref('cars', lazy='dynamic')) Here we have to model the relationship ourself. First we need to declare the foreign key. It has to have the same type as the primary key of the table we want to point to and additionally the column needs to be given a ForeignKey instance with the first argument being the dotted name to the column referenced. Note that this is the table name, not the class name. The relationship is then declared on Car with relationship. The first argument is a class or the name of a class we want to have the relationship with. By default it will try to find a valid join condition automatically. If it does not, you can explicitly provide one as a string or real expression: manufacturer = relationship('Manufacturer', primaryjoin='Car.manufacturer_id == Manufacturer.id', backref=backref('cars', lazy='dynamic')) The backref argument automatically declares the reverse. It will attach a cars property on the manufacturer. The lazy='dynamic' tells SQLAlchemy to make the backref lazy and a dynamic loading one. In that case accessing manufacturer.cars will be a query object you can further refine instead of directly firing the query and returning a list. Other lazy settings: 'select': if accessed load everything as list with another select statement. This is the default. 'joined': uses a join to automatically load that backref with the query of the parent itself. 'dynamic': returns a query object instead of firing the query. This can be sliced and further extended. The lazy settings can also be set on relationship and not just backref. Backref in a nutshell: 'lazy' and 'select'. The first one fires a query when honda.cars is accessed, the other one will fetch it when honda is queried: >>> honda.cars [<Car 1>, <Car 2>] And here with 'dynamic': >>> honda.cars <AppenderQuery ...> >>> honda.cars.all() [<Car 1>, <Car 2>]

+ Basic Models (Oct. 10, 2014, 1:31 a.m.)

from sqlalchemy.ext.declarative import declarative_base Base = declarative_base() A basic Django model looks something like this: class Person(models.Model): first_name = models.CharField(max_length=30) last_name = models.CharField(max_length=30) The equivalent SQLAlchemy model with declarative base looks like this: from sqlalchemy import Column, Integer, String class Person(Base): __tablename__ = 'persons' id = Column(Integer, primary_key=True) first_name = Column(String(30)) last_name = Column(String(30))