javascript - How to find parents based on child fields in mongo using aggregation? - Stack Overflow

Here is the code I have:

const _ = require('lodash')
const Box = require('./models/Box')

const boxesToBePicked = await Box.find({ status: 'ready', client: 27 })
const boxesOriginalIds = _(boxesToBePicked).map('original').compact().uniq().value()
const boxesOriginal = boxesOriginalIds.length ? await Box.find({ _id: { $in: boxesOriginalIds } }) : []

const attributes = ['name']

const boxes = [
  ...boxesOriginal,
  ...boxesToBePicked.filter(box => !box.original)
].map(box => _.pick(box, attributes))

Let's say we have the following data in the "boxes" collection:

[
  { _id: 1, name: 'Original Box #1', status: 'pending' },
  { _id: 2, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 3, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 4, name: 'Nested box', status: 'pending', original: 1 },
  { _id: 5, name: 'Original Box #2', status: 'ready' },
  { _id: 6, name: 'Original Box #3', status: 'pending' },
  { _id: 7, name: 'Nested box', status: 'ready', original: 6 },
  { _id: 8, name: 'Original Box #4', status: 'pending' }
]

Workflow

Find all boxes that are ready to be picked:

const boxesToBePicked = await Box.find({ status: 'ready' })

// Returns:

[
  { _id: 2, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 3, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 5, name: 'Original Box #2', status: 'ready' },
  { _id: 7, name: 'Nested box', status: 'ready', original: 6 }
]

Get all the IDs of original (parent) boxes of those:

const boxesOriginalIds = _(boxesToBePicked).map('original').compact().uniq().value()

// Returns:

[1, 6]

Get those boxes by their IDs:

const boxesOriginal = boxesOriginalIds.length ? await Box.find({ _id: { $in: boxesOriginalIds } }) : []

// Returns

[
  { _id: 1, name: 'Original Box #1', status: 'pending' },
  { _id: 6, name: 'Original Box #3', status: 'pending' }
]

Combine those boxes with the non-nested boxes to be picked:

const boxes = [
  ...boxesOriginal,
  ...boxesToBePicked.filter(box => !box.original)
].map(box => _.pick(box, attributes))

// Returns

[
  { name: 'Original Box #1' },
  { name: 'Original Box #3' },
  { name: 'Original Box #2' }
]

So basically, what we are doing here is getting all the original boxes that have at least one nested box with status "ready", plus all non-nested boxes with status "ready".
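As a reference point for checking any pipeline against the sample data, the rule can also be written as a plain in-memory filter. This is only a sketch: selectBoxes is a hypothetical helper name, and it assumes that a top-level box is one without an original field.

```javascript
// Sample data from the "boxes" collection above.
const sampleBoxes = [
  { _id: 1, name: 'Original Box #1', status: 'pending' },
  { _id: 2, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 3, name: 'Nested box', status: 'ready', original: 1 },
  { _id: 4, name: 'Nested box', status: 'pending', original: 1 },
  { _id: 5, name: 'Original Box #2', status: 'ready' },
  { _id: 6, name: 'Original Box #3', status: 'pending' },
  { _id: 7, name: 'Nested box', status: 'ready', original: 6 },
  { _id: 8, name: 'Original Box #4', status: 'pending' }
];

// Hypothetical helper: keeps top-level boxes that are "ready" themselves
// or that have at least one nested box with status "ready".
function selectBoxes(boxes) {
  const hasReadyChild = parent =>
    boxes.some(box => box.original === parent._id && box.status === 'ready');

  return boxes.filter(box =>
    box.original == null && (box.status === 'ready' || hasReadyChild(box))
  );
}

// Results come back in collection order, not in the order of the spread above.
console.log(selectBoxes(sampleBoxes).map(box => box.name));
// → [ 'Original Box #1', 'Original Box #2', 'Original Box #3' ]
```

The nested scan is quadratic, so this is meant as a checkable statement of the rule rather than something to run against a large collection.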

I think it can be simplified by using an aggregation pipeline and projection. But how?

Asked Dec 22, 2016 at 20:55 by Nazar, edited Feb 26, 2017 at 15:30 by Bertrand Martel

5 Answers

You can try something like the following. It uses $lookup to self-join the collection, then a $match stage with $or: the branch combined with $and covers the second condition (non-nested boxes with status "ready"), and the other branch of the $or covers the first condition. A $group stage then removes duplicates and a $project stage formats the response.

db.boxes.aggregate([{
    $lookup: {
        from: "boxes",
        localField: "original",
        foreignField: "_id",
        as: "nested_orders"
    }
}, {
    $unwind: {
        path: "$nested_orders",
        preserveNullAndEmptyArrays: true
    }
}, {
    $match: {
        $or: [{
            $and: [{
                "status": "ready"
            }, {
                "nested_orders": {
                    $exists: false,
                }
            }]
        }, {
            "nested_orders.status": "pending"
        }]
    }
}, {
    $group: {
        "_id": null,
        "names": {
            $addToSet: {
                name: "$name",
                nested_name: "$nested_orders.name"
            }
        }
    }
}, {
    $unwind: "$names"
}, {
    $project: {
        "_id": 0,
        "name": {
            $ifNull: ['$names.nested_name', '$names.name']
        }
    }
}]).pretty();

Sample Response

{ "name" : "Original Box #1" }
{ "name" : "Original Box #2" }
{ "name" : "Original Box #3" }

To decompose the aggregation:

  • a $group stage that creates

    • an array ids holding the original value of every document whose status is ready
    • an array box_ready holding the documents whose status is ready, with their other fields kept as is (it will be used later)
    • an array document containing every whole original document ($$ROOT)

      {
          $group: {
              _id: null,
              ids: {
                  $addToSet: {
                      $cond: [
                          { $eq: ["$status", "ready"] },
                          "$original", null
                      ]
                  }
              },
              box_ready: {
                  $addToSet: {
                      $cond: [
                          { $eq: ["$status", "ready"] },
                          { _id: "$_id", name: "$name", original: "$original", status: "$status" },
                          null
                      ]
                  }
              },
              document: { $push: "$$ROOT" }
          }
      }
      
  • $unwind the document field to flatten the array

    {
        $unwind: "$document"
    }
    
  • use a $redact stage to keep or remove records depending on whether $document._id is in the ids array created earlier (which contains the original values of the ready boxes)

    {
        $redact: {
            "$cond": {
                "if": {
                    "$setIsSubset": [{
                            "$map": {
                                "input": { "$literal": ["A"] },
                                "as": "a",
                                "in": "$document._id"
                            }
                        },
                        "$ids"
                    ]
                },
                "then": "$$KEEP",
                "else": "$$PRUNE"
            }
        }
    }
    
  • $group to push all documents that passed the previous $redact into another array named filtered (we now have 2 arrays that can be united)

    {
        $group: {
            _id: null,
            box_ready: { $first: "$box_ready" },
            filtered: { $push: "$document" }
        }
    }
    
  • use a $project with $setUnion to union the arrays box_ready and filtered

    {
        $project: {
            union: {
                $setUnion: ["$box_ready", "$filtered"]
            },
            _id: 0
        }
    }
    
  • $unwind the array you have obtained to get distinct records

    {
        $unwind: "$union"
    }
    
  • $match only those that have no original field and that are not null (the status: ready condition in the first $group produces a null value for documents that do not match)

    {
        $match: {
            "union.original": {
                "$exists": false
            },
            "union": { $nin: [null] }
        }
    }
    

The whole aggregation query is:

db.collection.aggregate(
    [{
        $group: {
            _id: null,
            ids: {
                $addToSet: {
                    $cond: [
                        { $eq: ["$status", "ready"] },
                        "$original", null
                    ]
                }
            },
            box_ready: {
                $addToSet: {
                    $cond: [
                        { $eq: ["$status", "ready"] },
                        { _id: "$_id", name: "$name", original: "$original", status: "$status" },
                        null
                    ]
                }
            },
            document: { $push: "$$ROOT" }
        }
    }, {
        $unwind: "$document"
    }, {
        $redact: {
            "$cond": {
                "if": {
                    "$setIsSubset": [{
                            "$map": {
                                "input": { "$literal": ["A"] },
                                "as": "a",
                                "in": "$document._id"
                            }
                        },
                        "$ids"
                    ]
                },
                "then": "$$KEEP",
                "else": "$$PRUNE"
            }
        }
    }, {

        $group: {
            _id: null,
            box_ready: { $first: "$box_ready" },
            filtered: { $push: "$document" }
        }

    }, {
        $project: {
            union: {
                $setUnion: ["$box_ready", "$filtered"]
            },
            _id: 0
        }
    }, {
        $unwind: "$union"
    }, {
        $match: {
            "union.original": {
                "$exists": false
            },
            "union": { $nin: [null] }
        }
    }]
)

It gives you:

{ "union" : { "_id" : 1, "name" : "Original Box #1", "status" : "pending" } }
{ "union" : { "_id" : 5, "name" : "Original Box #2", "status" : "ready" } }
{ "union" : { "_id" : 6, "name" : "Original Box #3", "status" : "pending" } }

Use an additional $project if you want to select specific fields.
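For instance, to return only the name of each box, a stage like the one below could be appended after the final $match. This is a sketch; the union prefix comes from the earlier $project stage, and the variable name is just for illustration.

```javascript
// Extra stage to append to the pipeline: keep only the box name.
// Each resulting document should be shaped like { name: '...' }.
const projectNameStage = {
  $project: {
    _id: 0,
    name: '$union.name'
  }
};
```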

With mongoose, you should be able to run the aggregation like this:

Box.aggregate([
    //the whole aggregation here
], function(err, result) {

});
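Newer Mongoose releases favor promises over callbacks, so a promise-based variant may be a safer fit there. The sketch below assumes only that the model exposes aggregate(pipeline).exec(), which returns a promise; aggregateBoxes is a hypothetical helper name.

```javascript
// Hypothetical helper: runs an aggregation pipeline on a Mongoose model
// and resolves with the resulting documents.
async function aggregateBoxes(Model, pipeline) {
  return Model.aggregate(pipeline).exec();
}

// Usage, assuming Box is the Mongoose model from the question:
// const result = await aggregateBoxes(Box, [/* the whole aggregation here */]);
```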

Several of the answers are close but here's the most efficient way. It accumulates the "_id" values of boxes to be picked up and then uses $lookup to "rehydrate" the full details of each (top-level) box.

db.boxes.aggregate(
    {$group: {
         _id:null, 
         boxes:{$addToSet:{$cond:{
            if:{$eq:["$status","ready"]},
            then:{$ifNull:["$original","$_id"]},
            else:null
         }}}
    }},

    {$lookup: {
          from:"boxes",
          localField:"boxes",
          foreignField:"_id",
          as:"boxes"
    }}
)

Your result based on sample data:

{
"_id" : null,
"boxes" : [
    {
        "_id" : 1,
        "name" : "Original Box #1",
        "status" : "pending"
    },
    {
        "_id" : 5,
        "name" : "Original Box #2",
        "status" : "ready"
    },
    {
        "_id" : 6,
        "name" : "Original Box #3",
        "status" : "pending"
    }
] }

Note that the $lookup is done only for the _id values of boxes to be picked up which is far more efficient than doing it for all boxes.
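If one document per box is preferable to a single document holding an array, two further stages could be appended. This is a sketch rather than part of the original answer: it assumes MongoDB 3.4 or later, where $replaceRoot is available.

```javascript
const pipeline = [
  // Stage 1 (from the answer): collect the _id of each box to pick up.
  { $group: {
      _id: null,
      boxes: { $addToSet: { $cond: {
        if: { $eq: ['$status', 'ready'] },
        then: { $ifNull: ['$original', '$_id'] },
        else: null
      } } }
  } },
  // Stage 2 (from the answer): replace the ids with the full box documents.
  { $lookup: {
      from: 'boxes',
      localField: 'boxes',
      foreignField: '_id',
      as: 'boxes'
  } },
  // Assumed extra stages: emit one result document per box.
  { $unwind: '$boxes' },
  { $replaceRoot: { newRoot: '$boxes' } }
];

// Usage in the shell: db.boxes.aggregate(pipeline)
```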

If you wanted the pipeline to be even more efficient, you would need to store more details about the original box (such as its name) in the nested box documents.

To achieve your goal, you can follow the steps below:

  1. First, select the records whose status is ready (because you want the parents that have no nested box but are ready themselves, and the parents that have at least one nested box whose status is ready)

  2. Find the parent box using $lookup

  3. Then $group to get the unique parent boxes

  4. Then $project the box name

So you can try this query:

db.getCollection('boxes').aggregate(
        {$match:{"status":'ready'}},
        {$lookup: {from: "boxes", localField: "original", foreignField: "_id", as: "parent"}},
        {$unwind: {path: "$parent",preserveNullAndEmptyArrays: true}},
        {$group:{
                _id:null,
                list:{$addToSet:{"$cond": [ { "$ifNull": ["$parent.name", false] }, {name:"$parent.name"}, {name:"$name"} ]}}
                }
        },
        {$project:{name:"$list.name", _id:0}},
        {$unwind: "$name"}
 )

OR

  1. Get the records whose status is ready
  2. Get the desired record ID (the parent's ID if there is one, otherwise the record's own)
  3. Get the name for each record ID

db.getCollection('boxes').aggregate(
        {$match:{"status":'ready'}},
        {$group:{
                _id:null,
                parent:{$addToSet:{"$cond": [ { "$ifNull": ["$original", false] }, "$original", "$_id" ]}}
                }
        },
        {$unwind:"$parent"},
        {$lookup: {from: "boxes", localField: "parent", foreignField: "_id", as: "parent"}},
        {$project: {"name" : { $arrayElemAt: [ "$parent.name", 0 ] }, _id:0}}
 )

Using mongoose (4.x)

Schema:

const schema = mongoose.Schema({
    _id: Number,
    ....
    status: String,
    original: { type: Number, ref: 'Box'}
});
const Box = mongoose.model('Box', schema);

Actual Query:

Box
    .find({ status: 'ready' })
    .populate('original')
    .exec((err, boxes) => {
        if (err) return;
        boxes = boxes.map((b) => b.original ? b.original : b);
        boxes = _.uniqBy(boxes, '_id');
        console.log(boxes);
    });
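The same logic can be sketched with async/await, which may suit newer Mongoose releases better than callbacks. findBoxesToPick is a hypothetical helper name; the sketch assumes the model is passed in and that original is populated as in the schema above, and it replaces _.uniqBy with a small Set-based filter.

```javascript
// Hypothetical helper: resolves with the unique top-level boxes to pick.
async function findBoxesToPick(Box) {
  const ready = await Box.find({ status: 'ready' }).populate('original').exec();

  // Replace each nested box with its populated parent.
  const topLevel = ready.map(box => box.original || box);

  // Deduplicate by _id, keeping the first occurrence of each box.
  const seen = new Set();
  return topLevel.filter(box => {
    const key = String(box._id);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```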

Docs on Mongoose#populate: http://mongoosejs.com/docs/populate.html
