I am using SQLAlchemy (with a PostgreSQL database).
I defined a database model with the following structure:
Table Family with the fields id and address.
Table Persons with id, name, family_id and parent_id.
family_id is a foreign key to table Family, and parent_id is a self-referencing foreign key to table Persons. This is used to define persons with children, who may also have children. This gives me the ability to define hierarchical structures.
I want to have an operation (for use in a REST API) where a family can be modified. The modification may consist of adding new members to the family and rearranging their hierarchical structure.
When persisting the modification I use SQLAlchemy's session.merge(data) function.
But this has the following effect: when I modify the family by adding a new person and define this person as the parent of one or more existing persons (i.e. making a new person the "parent" of existing persons), I get the error that parent_id does not exist in table "Persons". I assume this happens because the merge wants to update the existing entries (children) before inserting the new person (parent).
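The ordering issue described above can be reproduced outside the ORM. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for PostgreSQL and a simplified persons table; demo_fk_ordering is a hypothetical helper name, not part of the original model.

```python
import sqlite3


def demo_fk_ordering():
    # Assumption: in-memory SQLite stands in for PostgreSQL so the sketch is portable.
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute(
        "CREATE TABLE persons ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT,"
        " parent_id INTEGER REFERENCES persons(id))"
    )
    conn.execute("INSERT INTO persons (id, name, parent_id) VALUES (1, 'child', NULL)")

    # Wrong order: point the existing child at a parent row that does not exist yet.
    try:
        conn.execute("UPDATE persons SET parent_id = 2 WHERE id = 1")
        update_first_failed = False
    except sqlite3.IntegrityError:
        update_first_failed = True

    # Right order: insert the new parent first, then update the child.
    conn.execute("INSERT INTO persons (id, name, parent_id) VALUES (2, 'new parent', NULL)")
    conn.execute("UPDATE persons SET parent_id = 2 WHERE id = 1")
    parent_id = conn.execute("SELECT parent_id FROM persons WHERE id = 1").fetchone()[0]
    conn.close()
    return update_first_failed, parent_id
```

The first UPDATE is rejected by the foreign key constraint, while the same UPDATE succeeds once the parent row has been inserted.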
I can implement a workaround by manually splitting persons into parents and children before the merge operation, then first merging all parents and after that merging all children. But I would like to have a solution which works at the ORM level (i.e. where the table Persons is defined).
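The manual workaround amounts to ordering the incoming persons so that every parent comes before its children. A minimal sketch of that ordering step, assuming the payload is a list of plain dicts with the hypothetical keys "id" and "parent_id" (order_parents_first is likewise a hypothetical name):

```python
def order_parents_first(persons):
    """Return the person dicts so that each parent precedes its children.

    Assumption: a parent_id that points outside `persons` refers to a row
    that already exists in the database, so it needs no ordering here.
    """
    by_id = {p["id"]: p for p in persons}
    ordered, done = [], set()

    def visit(person, path):
        if person["id"] in done:
            return
        if person["id"] in path:
            raise ValueError("cycle in parent_id references")
        parent = by_id.get(person["parent_id"])
        if parent is not None:
            # Emit the parent (and its ancestors) before this person.
            visit(parent, path | {person["id"]})
        done.add(person["id"])
        ordered.append(person)

    for person in persons:
        visit(person, frozenset())
    return ordered
```

The items could then be passed to session.merge() in the returned order. Whether a flush is also needed between parents and children depends on how the model is mapped.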
Question
Is it possible to explicitly set the order of objects when using session.merge(), so that new objects are inserted before existing ones are updated? Either at model definition or when calling the merge function.
1 Answer
Usually merge is only used if you are operating on models not in the session, such as bringing them back into the session from a cache. It is not for common usage but is more of an advanced tool because it is hard to get right. Usually most people just need to use add().
You should be able to do something like this:
import os

from sqlalchemy import (
    Column,
    Integer,
    String,
    create_engine,
    ForeignKey,
)
from sqlalchemy.sql import (
    select,
)
from sqlalchemy.orm import (
    DeclarativeBase,
    Session,
    relationship,
    selectinload,
)


def get_engine(env):
    return create_engine(f"postgresql+psycopg2://{env['DB_USER']}:{env['DB_PASSWORD']}@{env['DB_HOST']}:{env['DB_PORT']}/{env['DB_NAME']}", echo=True)


class Base(DeclarativeBase):
    pass


class Family(Base):
    __tablename__ = 'families'
    id = Column(Integer, primary_key=True)
    address = Column(String, nullable=True)
    persons = relationship('Person', back_populates='family')


class Person(Base):
    __tablename__ = 'persons'
    id = Column(Integer, primary_key=True)
    family_id = Column(Integer, ForeignKey('families.id'), nullable=False)
    parent_id = Column(Integer, ForeignKey('persons.id'), nullable=True)
    family = relationship('Family', back_populates='persons')
    parent = relationship('Person', back_populates='children', remote_side=[id])
    children = relationship('Person', back_populates='parent', remote_side=[parent_id])


def run(conn):
    with Session(conn) as db:
        for family in db.scalars(select(Family).options(selectinload(Family.persons).selectinload(Person.children))):
            print(f"Persons residing at {family.address}")
            print("="*20)
            # Depth-first walk starting from the persons without a parent.
            q = [(0, person) for person in family.persons if not person.parent]
            while q:
                generation, person = q.pop()
                print(f"{' '*generation*4}{person.id}")
                q.extend([(generation+1, child) for child in person.children])


def populate(conn):
    with Session(conn) as db:
        smiths = Family(address='100 Main St')
        johnsons = Family(address='101 Main St')
        db.add_all([smiths, johnsons])
        db.commit()
    with Session(conn) as db:
        for family in db.scalars(select(Family)):
            grandpa = Person(family=family)
            father = Person(parent=grandpa, family=family)
            son = Person(parent=father, family=family)
            babies = [Person(parent=son, family=family) for _ in range(5)]
            # Add them to session first.
            persons = [grandpa, father, son] + babies
            db.add_all(persons)
            # Now link them into the family.
            family.persons.extend(persons)
        db.commit()
    with Session(conn) as db:
        for family in db.scalars(select(Family).options(selectinload(Family.persons).selectinload(Person.children))):
            # First-born of the son is now his new-found brother's child.
            grandpa = [person for person in family.persons if not person.parent][0]
            father = grandpa.children[0]
            son = father.children[0]
            baby0 = son.children[0]
            sons_brother = Person(parent=father, family=family)
            db.add(sons_brother)
            baby0.parent = sons_brother
        db.commit()


def main():
    engine = get_engine(os.environ)
    with engine.begin() as conn:
        Base.metadata.create_all(conn)
        populate(conn)
        run(conn)


if __name__ == '__main__':
    main()
See the SQLAlchemy documentation on adjacency list relationships.
Sample Output
...skipping setup
2025-01-21 08:00:31,160 INFO sqlalchemy.engine.Engine INSERT INTO families (address) SELECT p0::VARCHAR FROM (VALUES (%(address__0)s, 0), (%(address__1)s, 1)) AS imp_sen(p0, sen_counter) ORDER BY sen_counter RETURNING families.id, families.id AS id__1
2025-01-21 08:00:31,161 INFO sqlalchemy.engine.Engine [generated in 0.00005s (insertmanyvalues) 1/1 (ordered)] {'address__0': '100 Main St', 'address__1': '101 Main St'}
2025-01-21 08:00:31,162 INFO sqlalchemy.engine.Engine SELECT families.id, families.address
FROM families
2025-01-21 08:00:31,162 INFO sqlalchemy.engine.Engine [generated in 0.00007s] {}
2025-01-21 08:00:31,164 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,164 INFO sqlalchemy.engine.Engine [generated in 0.00008s] {'family_id': 1, 'parent_id': None}
2025-01-21 08:00:31,164 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,165 INFO sqlalchemy.engine.Engine [cached since 0.0007021s ago] {'family_id': 1, 'parent_id': 1}
2025-01-21 08:00:31,165 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,165 INFO sqlalchemy.engine.Engine [cached since 0.001127s ago] {'family_id': 1, 'parent_id': 2}
2025-01-21 08:00:31,166 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) SELECT p0::INTEGER, p1::INTEGER FROM (VALUES (%(family_id__0)s, %(parent_id__0)s, 0), (%(family_id__1)s, %(parent_id__1)s, 1), (%(family_id__2)s, %(parent_id__2)s, 2), (%(family_id__3)s, %(parent_id__3)s, 3), (%(family_id__4)s, %(parent_id__4)s, 4)) AS imp_sen(p0, p1, sen_counter) ORDER BY sen_counter RETURNING persons.id, persons.id AS id__1
2025-01-21 08:00:31,166 INFO sqlalchemy.engine.Engine [generated in 0.00003s (insertmanyvalues) 1/1 (ordered)] {'family_id__0': 1, 'parent_id__0': 3, 'family_id__1': 1, 'parent_id__1': 3, 'family_id__2': 1, 'parent_id__2': 3, 'family_id__3': 1, 'parent_id__3': 3, 'family_id__4': 1, 'parent_id__4': 3}
2025-01-21 08:00:31,167 INFO sqlalchemy.engine.Engine SELECT persons.id AS persons_id, persons.family_id AS persons_family_id, persons.parent_id AS persons_parent_id
FROM persons
WHERE %(param_1)s = persons.family_id
2025-01-21 08:00:31,167 INFO sqlalchemy.engine.Engine [generated in 0.00007s] {'param_1': 1}
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine [cached since 0.004766s ago] {'family_id': 2, 'parent_id': None}
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine [cached since 0.0051s ago] {'family_id': 2, 'parent_id': 9}
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) VALUES (%(family_id)s, %(parent_id)s) RETURNING persons.id
2025-01-21 08:00:31,169 INFO sqlalchemy.engine.Engine [cached since 0.00541s ago] {'family_id': 2, 'parent_id': 10}
2025-01-21 08:00:31,170 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) SELECT p0::INTEGER, p1::INTEGER FROM (VALUES (%(family_id__0)s, %(parent_id__0)s, 0), (%(family_id__1)s, %(parent_id__1)s, 1), (%(family_id__2)s, %(parent_id__2)s, 2), (%(family_id__3)s, %(parent_id__3)s, 3), (%(family_id__4)s, %(parent_id__4)s, 4)) AS imp_sen(p0, p1, sen_counter) ORDER BY sen_counter RETURNING persons.id, persons.id AS id__1
2025-01-21 08:00:31,170 INFO sqlalchemy.engine.Engine [cached since 0.004123s ago (insertmanyvalues) 1/1 (ordered)] {'family_id__0': 2, 'parent_id__0': 11, 'family_id__1': 2, 'parent_id__1': 11, 'family_id__2': 2, 'parent_id__2': 11, 'family_id__3': 2, 'parent_id__3': 11, 'family_id__4': 2, 'parent_id__4': 11}
2025-01-21 08:00:31,170 INFO sqlalchemy.engine.Engine SELECT persons.id AS persons_id, persons.family_id AS persons_family_id, persons.parent_id AS persons_parent_id
FROM persons
WHERE %(param_1)s = persons.family_id
2025-01-21 08:00:31,170 INFO sqlalchemy.engine.Engine [cached since 0.003335s ago] {'param_1': 2}
2025-01-21 08:00:31,172 INFO sqlalchemy.engine.Engine SELECT families.id, families.address
FROM families
2025-01-21 08:00:31,172 INFO sqlalchemy.engine.Engine [generated in 0.00007s] {}
2025-01-21 08:00:31,173 INFO sqlalchemy.engine.Engine SELECT persons.family_id AS persons_family_id, persons.id AS persons_id, persons.parent_id AS persons_parent_id
FROM persons
WHERE persons.family_id IN (%(primary_keys_1)s, %(primary_keys_2)s)
2025-01-21 08:00:31,173 INFO sqlalchemy.engine.Engine [generated in 0.00011s] {'primary_keys_1': 1, 'primary_keys_2': 2}
2025-01-21 08:00:31,174 INFO sqlalchemy.engine.Engine SELECT persons.parent_id AS persons_parent_id, persons.id AS persons_id, persons.family_id AS persons_family_id
FROM persons
WHERE persons.parent_id IN (%(primary_keys_1)s, %(primary_keys_2)s, %(primary_keys_3)s, %(primary_keys_4)s, %(primary_keys_5)s, %(primary_keys_6)s, %(primary_keys_7)s, %(primary_keys_8)s, %(primary_keys_9)s, %(primary_keys_10)s, %(primary_keys_11)s, %(primary_keys_12)s, %(primary_keys_13)s, %(primary_keys_14)s, %(primary_keys_15)s, %(primary_keys_16)s)
2025-01-21 08:00:31,174 INFO sqlalchemy.engine.Engine [generated in 0.00010s] {'primary_keys_1': 1, 'primary_keys_2': 2, 'primary_keys_3': 3, 'primary_keys_4': 4, 'primary_keys_5': 5, 'primary_keys_6': 6, 'primary_keys_7': 7, 'primary_keys_8': 8, 'primary_keys_9': 9, 'primary_keys_10': 10, 'primary_keys_11': 11, 'primary_keys_12': 12, 'primary_keys_13': 13, 'primary_keys_14': 14, 'primary_keys_15': 15, 'primary_keys_16': 16}
2025-01-21 08:00:31,176 INFO sqlalchemy.engine.Engine INSERT INTO persons (family_id, parent_id) SELECT p0::INTEGER, p1::INTEGER FROM (VALUES (%(family_id__0)s, %(parent_id__0)s, 0), (%(family_id__1)s, %(parent_id__1)s, 1)) AS imp_sen(p0, p1, sen_counter) ORDER BY sen_counter RETURNING persons.id, persons.id AS id__1
2025-01-21 08:00:31,176 INFO sqlalchemy.engine.Engine [cached since 0.0101s ago (insertmanyvalues) 1/1 (ordered)] {'family_id__0': 1, 'parent_id__0': 2, 'family_id__1': 2, 'parent_id__1': 10}
2025-01-21 08:00:31,176 INFO sqlalchemy.engine.Engine UPDATE persons SET parent_id=%(parent_id)s WHERE persons.id = %(persons_id)s
2025-01-21 08:00:31,177 INFO sqlalchemy.engine.Engine [generated in 0.00008s] [{'parent_id': 17, 'persons_id': 4}, {'parent_id': 18, 'persons_id': 12}]
2025-01-21 08:00:31,177 INFO sqlalchemy.engine.Engine SELECT families.id, families.address
FROM families
2025-01-21 08:00:31,177 INFO sqlalchemy.engine.Engine [cached since 0.005307s ago] {}
2025-01-21 08:00:31,178 INFO sqlalchemy.engine.Engine SELECT persons.family_id AS persons_family_id, persons.id AS persons_id, persons.parent_id AS persons_parent_id
FROM persons
WHERE persons.family_id IN (%(primary_keys_1)s, %(primary_keys_2)s)
2025-01-21 08:00:31,178 INFO sqlalchemy.engine.Engine [cached since 0.004792s ago] {'primary_keys_1': 1, 'primary_keys_2': 2}
2025-01-21 08:00:31,178 INFO sqlalchemy.engine.Engine SELECT persons.parent_id AS persons_parent_id, persons.id AS persons_id, persons.family_id AS persons_family_id
FROM persons
WHERE persons.parent_id IN (%(primary_keys_1)s, %(primary_keys_2)s, %(primary_keys_3)s, %(primary_keys_4)s, %(primary_keys_5)s, %(primary_keys_6)s, %(primary_keys_7)s, %(primary_keys_8)s, %(primary_keys_9)s, %(primary_keys_10)s, %(primary_keys_11)s, %(primary_keys_12)s, %(primary_keys_13)s, %(primary_keys_14)s, %(primary_keys_15)s, %(primary_keys_16)s, %(primary_keys_17)s, %(primary_keys_18)s)
2025-01-21 08:00:31,178 INFO sqlalchemy.engine.Engine [cached since 0.004449s ago] {'primary_keys_1': 1, 'primary_keys_2': 2, 'primary_keys_3': 3, 'primary_keys_4': 5, 'primary_keys_5': 6, 'primary_keys_6': 7, 'primary_keys_7': 8, 'primary_keys_8': 9, 'primary_keys_9': 10, 'primary_keys_10': 11, 'primary_keys_11': 13, 'primary_keys_12': 14, 'primary_keys_13': 15, 'primary_keys_14': 16, 'primary_keys_15': 17, 'primary_keys_16': 18, 'primary_keys_17': 4, 'primary_keys_18': 12}
Persons residing at 100 Main St
====================
1
    2
        17
            4
        3
            8
            7
            6
            5
Persons residing at 101 Main St
====================
9
    10
        18
            12
        11
            16
            15
            14
            13
2025-01-21 08:00:31,179 INFO sqlalchemy.engine.Engine COMMIT
merge? Instead of add or just working within the session. – Ian Wilson Commented Jan 20 at 17:49
add(). Maybe explain the use-case for merge() and I could try to update the example but it is hard to get right. – Ian Wilson Commented Jan 20 at 21:57