# Manipulating Data Using ORM

Until the previous chapter, we focused on utilizing queries from the CORE perspective.

In this chapter, we explain the components, lifecycle, and interaction methods of the Session used in the ORM approach.


# Inserting rows with ORM

The Session object, when using ORM, creates Insert objects and emits them in transactions. Session adds object entries to perform these processes. Then, through a process called flush, it records the new items in the database.

# Instances of Objects Representing Rows

In the previous process, we executed INSERT using a Python Dictionary.

In ORM, we directly use user-defined Python objects defined in the table metadata definition.

>>> squidward = User(name="squidward", fullname="Squidward Tentacles")
>>> krabs = User(name="ehkrabs", fullname="Eugene H. Krabs")

We create two User objects that represent potential database rows to be INSERTed. Because of __init__() constructor automatically created by ORM mapping, we can create each object using the constructor's column names as keys.

>>> squidward
User(id=None, name='squidward', fullname='Squidward Tentacles')

Similar to Core's Insert, ORM integrates it even without including a primary key. The None value for id is provided by SQLAlchemy to indicate that the attribute does not have a value yet.

Currently, the two objects (squiward and krabs) are in a state called transient. The transient state means they are not yet connected to any database and not yet associated with a Session object that can generate an INSERT statement.

# Adding Objects to the Session

>>> session = Session(engine)   # It is essential to close after use.
>>> session.add(squidward)      # Insert an object into session via Session.add() method.
>>> session.add(krabs)

When an object is added to the Session through Session.add(), it is called being in the pending state. The pending state means the object has not yet been added to the database.

>>> session.new   # You can check the objects in the pending state through session.new. Objects are added to the Session using the Session.add() method.
IdentitySet([User(id=None, name='squidward', fullname='Squidward Tentacles'), User(id=None, name='ehkrabs', fullname='Eugene H. Krabs')])
  • IdentitySet is a Python set that hashes object IDs in all cases.
  • That is, it uses the id() method, not the hash() function of Python's built-in functions."

# Flushing

The Session object uses the unit of work pattern (opens new window). This means that it accumulates changes but does not actually communicate with the database until necessary. This behavior allows objects in the previously mentioned pending state to be used more efficiently in SQL DML. The process of actually sending the current changes to the Database via SQL is called flushing.

>>> session.flush()
"""
INSERT INTO user_account (name, fullname) VALUES (?, ?)
[...] ('squidward', 'Squidward Tentacles')
INSERT INTO user_account (name, fullname) VALUES (?, ?)
[...] ('ehkrabs', 'Eugene H. Krabs')
"""

Now, the transaction remains open until one of Session.commit(), Session.rollback(), or Session.close() is invoked.

While you can use Session.flush() directly to push the current pending contents, Session typically features autoflush, so this is usually not necessary. Session.commit() flushes changes every time it is called.

# Automatically Generated Primary Key Properties

When a row is inserted, the Python object we created becomes persistent. The persistent state is associated with the loaded Session object.

During INSERT, the ORM retrieves the primary key identifier for each new object. This uses the same CursorResult.inserted_primary_key accessor introduced earlier.

>>> squidward.id
4
>>> krabs.id
5

When ORM is flushed, instead of executemany, two separate INSERT statements are used because of this CursorResult.inserted_primary_key. In SQLite, for instance, you need to INSERT one column at a time to use the auto-increment feature (other various databases like PostgreSQL's IDENTITY or SERIAL function similarly). If a database connection like psycopg2, which can provide primary key information for many rows at once, is used, the ORM optimizes this to INSERT many rows at once."

# Identity Map

Identity Map (ID Map) is an in-memory storage that links all objects currently loaded in memory to their primary key IDs. You can retrieve one of these objects through Session.get(). This method searches for the object in the ID Map if it's in memory, or through a SELECT statement if it's not.

>>> some_squidward = session.get(User, 4)
>>> some_squidward
User(id=4, name='squidward', fullname='Squidward Tentacles')

An important point is that the ID Map maintains unique objects among Python objects.

>>> some_squidward is squidward 
True

The ID Map is a crucial feature that allows manipulation of complex object sets within a transaction in an unsynchronized state.

# Committing

We now commit the current changes to the transaction.

>>> session.commit()
COMMIT

# How to UPDATE ORM objects

There are two ways to perform an UPDATE through ORM:

  1. Using the unit of work pattern employed by Session. UPDATE operations for each primary key with changes are sent out in sequence.
  2. Known as "ORM usage update", where you can explicitly use the Update construct with Session."

# Updating changes automatically

>>> sandy = session.execute(select(User).filter_by(name="sandy")).scalar_one()
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = ?
[...] ('sandy',)
"""

This 'Sandy' user object acts as a proxy for a row in the database, more specifically, for the row with primary key 2 from the transaction's perspective.

>>> sandy
User(id=2, name='sandy', fullname='Sandy Cheeks')
>>> sandy.fullname = "Sandy Squirrel"   # When an object's attribute is changed, the Session records this change.
>>> sandy in session.dirty              # Such changed objects are referred to as 'dirty' and can be checked in session.dirty.
True

When the Session executes flush, an UPDATE is executed in the database, actually updating the values in the database. If a SELECT statement is executed afterwards, a flush is automatically executed, allowing you to immediately retrieve the updated name value of Sandy through SELECT.

>>> sandy_fullname = session.execute(
...     select(User.fullname).where(User.id == 2)
... ).scalar_one()
"""
UPDATE user_account SET fullname=? WHERE user_account.id = ?
[...] ('Sandy Squirrel', 2)
SELECT user_account.fullname
FROM user_account
WHERE user_account.id = ?
[...] (2,)
"""
>>> print(sandy_fullname)
Sandy Squirrel
# Using the flush, Sandy's changes are actually reflected in the database,
# causing the object to lose its 'dirty' status.
>>> sandy in session.dirty 
False

# ORM usage update

The last method to perform an UPDATE through ORM is to explicitly use 'ORM usage update'. This allows you to use a general SQL UPDATE statement that can affect many rows at once.

>>> session.execute(
...     update(User).
...     where(User.name == "sandy").
...     values(fullname="Sandy Squirrel Extraordinaire")
... )
"""
UPDATE user_account SET fullname=? WHERE user_account.name = ?
[...] ('Sandy Squirrel Extraordinaire', 'sandy')
"""
<sqlalchemy.engine.cursor.CursorResult object ...>

If there are objects in the current Session that match the given conditions, the corresponding update will also be reflected in these objects.

>>> sandy.fullname
'Sandy Squirrel Extraordinaire'

# How to Delete ORM objects

You can mark individual ORM objects for deletion using the Session.delete() method. Once delete is executed, objects in that Session become expired.

>>> patrick = session.get(User, 3)
"""
SELECT user_account.id AS user_account_id, user_account.name AS user_account_name,
user_account.fullname AS user_account_fullname
FROM user_account
WHERE user_account.id = ?
[...] (3,)
"""
>>> session.delete(patrick)     # Indicate that patrick will be deleted
>>> session.execute(
...     select(User)
...     .where(User.name == "patrick")
... ).first()                   # Execute flush at this point
"""
SELECT address.id AS address_id, address.email_address AS address_email_address,
address.user_id AS address_user_id
FROM address
WHERE ? = address.user_id
[...] (3,)
DELETE FROM user_account WHERE user_account.id = ?
[...] (3,)
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = ?
[...] ('patrick',)
"""
>>> squidward in session # Once expired in the Session, the object is removed from the session.
False

Like the 'Sandy' used in the above UPDATE, these actions are only within the ongoing transaction and can be undone at any time unless committed.

# ORM usage delete

Like UPDATE, there is also 'ORM usage delete'.

# This is just an example, not a necessary operation for delete.
>>> squidward = session.get(User, 4)
"""
SELECT user_account.id AS user_account_id, user_account.name AS user_account_name,
user_account.fullname AS user_account_fullname
FROM user_account
WHERE user_account.id = ?
[...] (4,)
"""

>>> session.execute(delete(User).where(User.name == "squidward"))
"""
DELETE FROM user_account WHERE user_account.name = ?
[...] ('squidward',)
<sqlalchemy.engine.cursor.CursorResult object at 0x...>
"""

# Rolling Back

Session has a Session.rollback() method to roll back the current operations. This method affects Python objects like the aforementioned sandy. Calling Session.rollback() not only rolls back the transaction but also turns all objects associated with this Session into expired status. This state change triggers a self-refresh the next time the object is accessed, a process known as lazy loading.

>>> session.rollback()
ROLLBACK

Looking closely at sandy, which is in the expired state, you can see that no other information remains except for special SQLAlchemy-related status objects.

>>> sandy.__dict__
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x...>}
>>> sandy.fullname      # Since the session is expired, accessing the object properties will trigger a new transaction.
"""
SELECT user_account.id AS user_account_id, user_account.name AS user_account_name,
user_account.fullname AS user_account_fullname
FROM user_account
WHERE user_account.id = ?
[...] (2,)
"""
'Sandy Cheeks'
>>> sandy.__dict__    # Now you can see that the database row is also filled in the sandy object.
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x...>,
 'id': 2, 'name': 'sandy', 'fullname': 'Sandy Cheeks'}

For the deleted objects, you can see that they are restored in the Session and appear again in the database.

>>> patrick in session
True
>>> session.execute(select(User).where(User.name == 'patrick')).scalar_one() is patrick
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = ?
[...] ('patrick',)
"""
True

# Closing the Session

We have handled the Session outside of the context structure, and in such cases, it is good practice to explicitly close the Session as follows:

>>> session.close()
ROLLBACK

Similarly, when a Session created through a context statement is closed within the context statement, the following actions are performed.

  • Cancel all ongoing transactions (e.g., rollbacks) to release all connection resources to the connection pool.
    • This means you don't need to explicitly call Session.rollback() to check if the transaction was rolled back when closing the Session after performing some read-only operations with it. The connection pool handles this.
  • Remove all objects from the Session.
    • This means that all Python objects loaded for this Session, such as sandy, patrick, and squidward, are now in a detached state. For instance, an object that was in the expired state is no longer associated with a database transaction to refresh data due to a Session.commit() call, and it does not contain the current row's state.
    • >>> squidward.name
      Traceback (most recent call last):
      ...
      sqlalchemy.orm.exc.DetachedInstanceError: Instance <User at 0x...> is not bound to a Session; attribute refresh operation cannot proceed
      
    • Detached objects can be reassociated with the same or a new Session using the Session.add() method, re-establishing the relationship with a specific database row.
    • >>> session.add(squidward)    # Reconnect to the session
      >>> squidward.name            # Retrieve the information through the transaction again.
      """
      SELECT user_account.id AS user_account_id, user_account.name AS user_account_name, user_account.fullname AS user_account_fullname
      FROM user_account
      WHERE user_account.id = ?
      [...] (4,)
      """
      'squidward'
      

Objects in the detached state should ideally be avoided. When a Session is closed, it cleans up references to all previously connected objects. Typically, the need for detached objects arises in web applications when an object has just been committed and the Session is closed before it is rendered in a view. In this case, set the Session.expire_on_commit flag to False.