python - How to efficiently exclude already assigned objects in a Django QuerySet?

I am working on a Django project where I need to filter objects based on their status while excluding those that are already assigned in another model.

I have two models:

CartObject – Stores all objects.
OnGoingProcess – Tracks objects that are currently assigned.

Each OnGoingProcess entry has a OneToOneField relationship with CartObject, meaning each object can only be assigned once.

My goal is to fetch all objects with a specific status but exclude those that are already assigned in OnGoingProcess.

Models:

class CartObject(models.Model):
    object_id = models.CharField(max_length=100, unique=True)
    status = models.CharField(max_length=50, choices=[("pending", "Pending")])
    # Other fields...

class OnGoingProcess(models.Model):
    user = models.OneToOneField(DeliveryProfile, on_delete=models.CASCADE, related_name="ongoing_process")
    associated_object = models.OneToOneField(CartObject, on_delete=models.CASCADE, related_name="associated_process", blank=True, null=True)
    # Other fields...

Current View Code:

@user_passes_test(lambda user: user.is_staff)
def process_manager_view(request):
    # Get objects that are already assigned in OnGoingProcess
    assigned_objects = OnGoingProcess.objects.values_list('associated_object', flat=True)

    # Exclude objects that are already assigned
    available_objects = CartObject.objects.filter(status="pending").exclude(id__in=assigned_objects).order_by("-id")

    context = {
        "available_objects": available_objects,
    }
    return render(request, "useradmin/available-objects.html", context)

Issue:

I am using values_list('associated_object', flat=True) to extract the assigned object IDs.
Then, I am using exclude(id__in=assigned_objects) to filter out those objects.
Is this the most efficient way? Or is there a better Django ORM method to achieve the same result?
Should I use Subquery(), isnull=False, or any other approach for better performance?

Alternative Solutions I Considered:

Option 1: Using `isnull=False`

available_objects = CartObject.objects.filter(status="pending").exclude(associated_process__isnull=False)

Pros: Simple, avoids extra queries.
Cons: Not sure if it's the best approach for performance.

Option 2: Using `Subquery`

from django.db.models import Subquery

assigned_objects = OnGoingProcess.objects.values('associated_object')
available_objects = CartObject.objects.filter(status="pending").exclude(id__in=Subquery(assigned_objects))

Pros: Optimized for large datasets.
Cons: More complex.

Option 3: Using Raw SQL (if necessary)

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("""
        SELECT id FROM useradmin_cartobject 
        WHERE status='pending' 
        AND id NOT IN (SELECT associated_object FROM useradmin_ongoingprocess)
        ORDER BY id DESC
    """)
    result = cursor.fetchall()

available_objects = CartObject.objects.filter(id__in=[row[0] for row in result])

Pros: Performance boost for huge data.
Cons: Less readable, database-dependent.

Question:

What is the best and most efficient Django ORM approach to filter objects while excluding those that are already assigned?
Would values_list(), isnull=False, or Subquery() be the recommended way?
Are there any performance considerations when working with large datasets?

Thank you in advance!

I am working on a Django project where I need to filter objects based on their status while excluding those that are already assigned in another model.

I have two models:

CartObject – Stores all objects.
OnGoingProcess – Tracks objects that are currently assigned.

Each OnGoingProcess entry has a OneToOneField relationship with CartObject, meaning each object can only be assigned once.

My goal is to fetch all objects with a specific status but exclude those that are already assigned in OnGoingProcess.

Models:

class CartObject(models.Model):
    object_id = models.CharField(max_length=100, unique=True)
    status = models.CharField(max_length=50, choices=[("pending", "Pending")])
    # Other fields...

class OnGoingProcess(models.Model):
    user = models.OneToOneField(DeliveryProfile, on_delete=models.CASCADE, related_name="ongoing_process")
    associated_object = models.OneToOneField(CartObject, on_delete=models.CASCADE, related_name="associated_process", blank=True, null=True)
    # Other fields...

Current View Code:

@user_passes_test(lambda user: user.is_staff)
def process_manager_view(request):
    # Get objects that are already assigned in OnGoingProcess
    assigned_objects = OnGoingProcess.objects.values_list('associated_object', flat=True)

    # Exclude objects that are already assigned
    available_objects = CartObject.objects.filter(status="pending").exclude(id__in=assigned_objects).order_by("-id")

    context = {
        "available_objects": available_objects,
    }
    return render(request, "useradmin/available-objects.html", context)

Issue:

I am using values_list('associated_object', flat=True) to extract the assigned object IDs.
Then, I am using exclude(id__in=assigned_objects) to filter out those objects.
Is this the most efficient way? Or is there a better Django ORM method to achieve the same result?
Should I use Subquery(), isnull=False, or any other approach for better performance?

Alternative Solutions I Considered:

Option 1: Using `isnull=False`

available_objects = CartObject.objects.filter(status="pending").exclude(associated_process__isnull=False)

Pros: Simple, avoids extra queries.
Cons: Not sure if it's the best approach for performance.

Option 2: Using `Subquery`

from django.db.models import Subquery

assigned_objects = OnGoingProcess.objects.values('associated_object')
available_objects = CartObject.objects.filter(status="pending").exclude(id__in=Subquery(assigned_objects))

Pros: Optimized for large datasets.
Cons: More complex.

Option 3: Using Raw SQL (if necessary)

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("""
        SELECT id FROM useradmin_cartobject 
        WHERE status='pending' 
        AND id NOT IN (SELECT associated_object FROM useradmin_ongoingprocess)
        ORDER BY id DESC
    """)
    result = cursor.fetchall()

available_objects = CartObject.objects.filter(id__in=[row[0] for row in result])

Pros: Performance boost for huge data.
Cons: Less readable, database-dependent.

Question:

Thank you in advance!

Share Improve this question edited Mar 24 at 6:22 asked Mar 2 at 14:10 Google User 1751 silver badge8 bronze badges

Add a comment |

1 Answer 1

Sorted by: Reset to default 5

Your query:

available_objects = (
    CartObject.objects.filter(status='pending')
    .exclude(id__in=assigned_objects)
    .order_by('-id')
)

will use a subquery, so run as:

AND id NOT IN (SELECT associated_object FROM useradmin_ongoingprocess)

You can inspect it with:

print(available_objects.query)

But on databases like MySQL, this is not the most efficient one no.

For most databases:

available_objects = CartObject.objects.filter(status='pending').filter(
    associated_process=None
)

will yield results efficiently. We can rewrite this to:

available_objects = CartObject.objects.filter(
    status='pending', associated_process=None
)

this will generate the same query, but is a bit shorter in code.

This works based on the fact that a LEFT OUTER JOIN includes a NULL row for items for which there is no corresponding item at the table for the OnGoingProcess model. Then we thus retrieve only the ones with NULL, so only retain the CartObject with no OnGoingProcess.

As for raw queries, one usually uses this if there is no ORM-equivalent available, or when it is very cumbersome to construct it. Raw queries have additional disadvantages: a QuerySet can be filtered, paginated, etc. whereas a raw query is an "end product": you can not manipulate it further, so limiting what you can do with it with Django. It also means that if you change the corresponding models, like adding a column, you will probably have to rewrite the raw queries that work with these models as well.

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

python - How to efficiently exclude already assigned objects in a Django QuerySet? - Stack Overflow

Models:

Current View Code:

Issue:

Alternative Solutions I Considered:

Option 1: Using `isnull=False`

Option 2: Using `Subquery`

Option 3: Using Raw SQL (if necessary)

Question:

Models:

Current View Code:

Issue:

Alternative Solutions I Considered:

Option 1: Using `isnull=False`

Option 2: Using `Subquery`

Option 3: Using Raw SQL (if necessary)

Question:

1 Answer 1

与本文相关的文章

评论列表(0)

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

Models:

Current View Code:

Issue:

Alternative Solutions I Considered:

Option 1: Using isnull=False

Option 2: Using Subquery

Option 3: Using Raw SQL (if necessary)

Question:

Models:

Current View Code:

Issue:

Alternative Solutions I Considered:

Option 1: Using isnull=False

Option 2: Using Subquery

Option 3: Using Raw SQL (if necessary)

Question:

1 Answer 1

与本文相关的文章

评论列表(0)

Option 1: Using `isnull=False`

Option 2: Using `Subquery`

Option 1: Using `isnull=False`

Option 2: Using `Subquery`