
MongoDB duplicate results issue - Stack Overflow


I'm working on a project that uses a MongoDB query to fetch results and handle pagination and result limiting. The issue is that I'm starting to see duplicate results on every page, and I'm not sure what's causing it.

Here is my query:

async function getAllResults(uid, jobId, page = 1, limit = 10, sortField = 'id', sortOrder = 'asc') {
  const db = await connectToDatabase();
  const resultsCollection = db.collection('results');

  // Count the total number of entries across the unwound results arrays
  const totalCount = await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $count: "total" }
  ]).toArray();

  const total = totalCount[0]?.total || 0;

  // Build the sort specification from the requested field and direction
  const sortObj = {};
  sortObj[`results.${sortField}`] = sortOrder === 'asc' ? 1 : -1;

  // Unwind the array so each entry can be sorted and paginated individually
  const pipeline = [
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $sort: sortObj },
  ];

  if (limit > 0) {
    const skip = (page - 1) * limit;
    pipeline.push({ $skip: skip });
    pipeline.push({ $limit: limit });
  }

  // Re-assemble the paginated entries into a single results array
  pipeline.push({
    $group: {
      _id: null,
      results: { $push: "$results" }
    }
  });

  const jobResults = await resultsCollection.aggregate(pipeline).toArray();

  if (jobResults.length === 0) {
    return { results: [], total: 0, pages: 0 };
  }

  return {
    results: jobResults[0].results,
    total: total,
    pages: limit > 0 ? Math.ceil(total / limit) : 1
  };
}

I tried several solutions suggested by AI, but none of them worked. I suspect the issue is with $unwind. One suggestion did work, but it required handling the results in memory, which could cause performance problems when result sets are large (10k up to 500k entries).
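
For context, the in-memory approach looked roughly like this (a simplified sketch, not the exact code, assuming a single matching document per user/job): pull the whole results array into Node, then sort and slice it in JavaScript.

// Rough sketch of the in-memory workaround: load every entry, then sort and
// slice in JavaScript. It works, but with 10k-500k entries per job the whole
// array has to be held in memory on every request.
const doc = await resultsCollection.findOne({ userId: uid, jobId: jobId });
const all = doc?.results ?? [];

const dir = sortOrder === 'asc' ? 1 : -1;
all.sort((a, b) => (a[sortField] < b[sortField] ? -dir : a[sortField] > b[sortField] ? dir : 0));

const pageItems = limit > 0 ? all.slice((page - 1) * limit, page * limit) : all;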

This is an example document for a result. Each JSON object in results has data (the contents are omitted here).

{
  "_id": "6735781e6707ab39c1f694e5",
  "userId": "OJctGjbmxJN7iLhIKIXeItYd0BT2",
  "jobId": "68c96821-d601-4c77-b9eb-b2df1e09a838",
  "documentId": "68c96821-d601-4c77-b9eb-b2df1e09a838-0",
  "type": "email",
  "date": "2024-11-14T04:10:06.440+00:00",
  "results": [
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}
  ]
}

And this is the returned response:

{
    "results": [
        {
            "input": "...",
            "success": true,
            "avatar": "...",
            "city": "...",
            "country": "...",
            "email": "..."
        },
        {
            ...
        }
    ],
    "total": 100,
    "pages": 10
}

asked Nov 16, 2024 at 5:13 by Joint
  • Show some example documents. Your default sort field is id rather than _id, so the sort key becomes results.id. Does this field actually exist? Is it unique? If not, add a second sort condition; otherwise the sort is on that one field alone, and documents that share the same value for it will come back in an undetermined, effectively random order. Add this stage to the end and make sure all the counts are 1: { $group: { _id: "$results.id", count: { $count: {} } } }. – aneroid Commented Nov 16, 2024 at 5:38
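
A sketch of the check that comment suggests (the field name results.id is an assumption; substitute whatever you actually sort on):

// Sketch: verify the sort key is unique across the unwound entries.
// Any group with count > 1 means $sort has ties, and tied entries can land
// on different pages from one query to the next.
const duplicates = await resultsCollection.aggregate([
  { $match: { userId: uid, jobId: jobId } },
  { $unwind: "$results" },
  { $group: { _id: "$results.id", count: { $count: {} } } },
  { $match: { count: { $gt: 1 } } }
]).toArray();

console.log(duplicates.length === 0 ? "sort key is unique" : duplicates);

If there are ties (or the field doesn't exist at all), add a unique tiebreaker to the $sort stage, for example the parent _id together with the array index exposed by $unwind's includeArrayIndex option.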

1 Answer


You don't need $unwind and $count to get the total; just use

{ $set: {total: {$size: "$results"}} }

Instead of $unwind, $skip, $limit, and $group, you can use $slice.

You can do it in one query, something like this (not tested):

const skip = (page - 1) * limit;

const jobResults = await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $set: {
        // total number of entries, computed without $unwind
        total: { $size: "$results" },
        // take the requested page straight from the embedded array
        results: { $slice: [ "$results", skip, limit ] }
      }
    }
  ]).toArray();

if (jobResults.length === 0) {
  return { results: [], total: 0, pages: 0 };
}

return {
    results: jobResults[0].results,
    total: jobResults[0].total,
    pages: limit > 0 ? Math.ceil(jobResults[0].total / limit) : 1
  };

You may use

results: {
   $slice: [
      { $sortArray: { input: "$results", sortBy: sortObj } },
      skip,
      limit
   ]
}

to sort the whole array before taking the slice, so each page comes back in a consistent order.
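
Putting those pieces together, a minimal sketch of the whole function (untested; it assumes MongoDB 5.2+ for $sortArray, a single matching document per userId/jobId pair, and that sortField names a field of the array elements themselves, without the results. prefix):

// Sketch: single-query pagination over the embedded results array.
async function getAllResults(uid, jobId, page = 1, limit = 10, sortField = 'id', sortOrder = 'asc') {
  const db = await connectToDatabase();
  const resultsCollection = db.collection('results');

  // Sort spec for $sortArray: keys refer to fields of the array elements.
  const sortBy = { [sortField]: sortOrder === 'asc' ? 1 : -1 };
  const skip = limit > 0 ? (page - 1) * limit : 0;

  const docs = await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $set: {
        // array length, computed server-side without $unwind
        total: { $size: "$results" },
        // sort the full array first, then take only the requested page
        results: limit > 0
          ? { $slice: [{ $sortArray: { input: "$results", sortBy } }, skip, limit] }
          : { $sortArray: { input: "$results", sortBy } }
      }
    }
  ]).toArray();

  if (docs.length === 0) {
    return { results: [], total: 0, pages: 0 };
  }

  return {
    results: docs[0].results,
    total: docs[0].total,
    pages: limit > 0 ? Math.ceil(docs[0].total / limit) : 1
  };
}

Because the sorting and slicing happen in the server's expression engine, only one page of entries ever crosses the wire, which avoids the in-memory handling the question was worried about.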
