I'm using StringAgg
and order
as follows:
# Get order column & annotate with list of credits
if request.POST.get('order[0][name]'):
order = request.POST['order[0][name]']
if order == 'credits_primary':
releases = releases.annotate(credits_primary=StringAgg(
'credits__entity__name',
delimiter=', ',
filter=Q(credits__type='primary'),
ordering='credits__id'
))
elif order == 'credits_secondary':
releases = releases.annotate(credits_secondary=StringAgg(
'credits__entity__name',
delimiter=', ',
filter=Q(credits__type='secondary'),
ordering='credits__id'
))
else:
order = 'title'
# Order releases
if request.POST.get('order[0][dir]') == 'desc':
releases = releases.order_by(F(order).desc(nulls_last=True), 'title')
else:
releases = releases.order_by(F(order).asc(nulls_last=True), 'title')
for release in releases:
try: print(release.credits_primary)
except: pass
try: print(release.credits_secondary)
except: pass
This in itself works exactly as expected: the ordering is what I expect, and print
returns the values I expect.
However, when I apply a filter
before this, where the result includes multiple instances of the same release
, the credits__entity__name
s will be repeated or omitted for that release, depending on how many results there are for the field(s) being filtered by, despite using distinct
.
Below is the filter
I'm applying that affects the aggregation (applied before StringAgg
):
# Other filters...
# Filter by Search
if request.POST.get('search[value]'):
query = Q()
search = request.POST['search[value]']
query.add(Q(title__icontains=search), Q.OR)
query.add(Q(tags__tag__name__icontains=search), Q.OR)
query.add(Q(credits__entity__name__icontains=search), Q.OR)
releases = releases.filter(query)
# Make sure items aren't repeated
releases = releases.distinct()
I'm using StringAgg
and order
as follows:
# Get order column & annotate with list of credits
if request.POST.get('order[0][name]'):
order = request.POST['order[0][name]']
if order == 'credits_primary':
releases = releases.annotate(credits_primary=StringAgg(
'credits__entity__name',
delimiter=', ',
filter=Q(credits__type='primary'),
ordering='credits__id'
))
elif order == 'credits_secondary':
releases = releases.annotate(credits_secondary=StringAgg(
'credits__entity__name',
delimiter=', ',
filter=Q(credits__type='secondary'),
ordering='credits__id'
))
else:
order = 'title'
# Order releases
if request.POST.get('order[0][dir]') == 'desc':
releases = releases.order_by(F(order).desc(nulls_last=True), 'title')
else:
releases = releases.order_by(F(order).asc(nulls_last=True), 'title')
for release in releases:
try: print(release.credits_primary)
except: pass
try: print(release.credits_secondary)
except: pass
This in itself works exactly as expected: the ordering is what I expect, and print
returns the values I expect.
However, when I apply a filter
before this, where the result includes multiple instances of the same release
, the credits__entity__name
s will be repeated or omitted for that release, depending on how many results there are for the field(s) being filtered by, despite using distinct
.
Below is the filter
I'm applying that affects the aggregation (applied before StringAgg
):
# Other filters...
# Filter by Search
if request.POST.get('search[value]'):
query = Q()
search = request.POST['search[value]']
query.add(Q(title__icontains=search), Q.OR)
query.add(Q(tags__tag__name__icontains=search), Q.OR)
query.add(Q(credits__entity__name__icontains=search), Q.OR)
releases = releases.filter(query)
# Make sure items aren't repeated
releases = releases.distinct()
Share
Improve this question
edited Mar 30 at 11:18
bur
asked Mar 29 at 17:03
burbur
8098 silver badges22 bronze badges
2 Answers
Reset to default 2Below is the filter I'm applying that affects the aggregation (applied before StringAgg):
The filter looks at (other) one-to-many or many-to-many relations. This means that this will result in LEFT OUTER JOIN
that will start to act as a multiplier. The same works with other aggregates such as SUM
and COUNT
.
Indeed, if you just aggregate, the query looks like:
SELECT string_agg(CASE
WHEN c.type = 'primary' THEN e.name
ELSE NULL
END
ORDER BY e.id)
FROM releases AS r
LEFT OUTER JOIN credits AS c ON c.release_id = r.id
LEFT OUTER JOIN entity AS e ON e.credit_id = e.id
and that is fine, because we want to aggregate over the entities, so per release we get all entities.
But now if we for example filter on the tags, it looks like:
SELECT string_agg(CASE
WHEN c.type = 'primary' THEN e.name
ELSE NULL
END
ORDER BY e.id)
FROM releases AS r
LEFT OUTER JOIN credits AS c ON c.release_id = r.id
LEFT OUTER JOIN entity AS e ON e.credit_id = e.id
LEFT OUTER JOIN release_tag rt1 ON rt1.release_id = r.id
LEFT OUTER JOIN tag t1 ON rt1.tag_id = t1.id
LEFT OUTER JOIN release_tag rt2 ON rt2.release_id = r.id
LEFT OUTER JOIN tag t1 ON rt2.tag_id = t2.id
WHERE t1.name = 'tag1'
OR t2.name = 'tag2'
This means that the entity names for a release where one of the two tags match will be repeated once, but where both match, it will be included twice.
The Q(credits__entity__name__icontains)
makes it even worse: since this will join on the table c
, and thus entities of credits
that have two or more matches will be included that many times, whereas the entities of a release that does not match with a given credit, will no longer be included.
The least common divider of all these phenomena is probably that you should be very careful to make aggregates when you make JOIN
s especially if multiple tables on which the JOINs are caried our are involved.
What you probably can do is make a Subquery
expression [Django-doc] which is thus not impacted by the "outer" tables and the corresponding JOIN
.
Not sure I understand your use case, but if the problem is that you get duplicates in your aggregation, maybe you can try to use distinct
on your string aggregations:
https://docs.djangoproject/fr/5.1/ref/contrib/postgres/aggregates/#stringagg