mongodb - Jolt Transformation To Replace a parent key with its nested key value If present otherwise just print - Stack Overflow

I am trying to write a Jolt transformation in Apache NiFi. For background, the NiFi pipeline reads from a MongoDB collection and inserts into ClickHouse. When reading, every ObjectId value is represented as a nested $oid object in the JSON, which causes problems when writing to ClickHouse.

I am thinking of using a Jolt Transformation step as I want a more dynamic way to handle any ObjectId key.

For example, sample input:

{
  "_id": {
    "$oid": "67e3b577b9897ea76e00bd9e"
  },
  "relationshipId": "67e3b541b9897ea76e007679-65316a69e9d1652da805b106-9810446605",
  "0To1Month": "",
  "groupId": {
    "$oid": "60d6e2b16bf83142a381f5fb"
  },
  "groupLastAttemptDayAgo": 1742978685000,
  "nested": {
    "okdok": {
      "$oid": "60d6e2b16bf83142a381f5f9"
    }
  }
}

Sample Output

{
  "_id" : "67e3b577b9897ea76e00bd9e",
  "relationshipId" : "67e3b541b9897ea76e007679-65316a69e9d1652da805b106-9810446605",
  "0To1Month" : "",
  "groupId" : "60d6e2b16bf83142a381f5fb",
  "groupLastAttemptDayAgo" : 1742978685000,
  "nested" : {
    "okdok" : "60d6e2b16bf83142a381f5f9"
  }
}
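
The mapping from the sample input to the sample output can be stated independently of Jolt as one recursive rule: wherever an object consists only of a `$oid` key, replace that object with the string it holds, and leave everything else untouched. The following is a minimal Python sketch of that rule, meant only to pin down the expected result at any nesting depth. It is not a NiFi or Jolt feature, and `unwrap_oid` is a hypothetical helper name introduced here for illustration.

```python
import json


def unwrap_oid(value):
    """Recursively replace {"$oid": "<hex>"} wrappers with the bare string.

    `unwrap_oid` is a hypothetical helper, not part of NiFi, Jolt, or any driver.
    """
    if isinstance(value, dict):
        # Assumption: an ObjectId wrapper is a dict whose only key is "$oid".
        if set(value) == {"$oid"}:
            return value["$oid"]
        # Any other object keeps its keys; only its children are processed.
        return {key: unwrap_oid(child) for key, child in value.items()}
    if isinstance(value, list):
        return [unwrap_oid(item) for item in value]
    # Strings, numbers, booleans and None pass through unchanged.
    return value


sample = {
    "_id": {"$oid": "67e3b577b9897ea76e00bd9e"},
    "relationshipId": "67e3b541b9897ea76e007679-65316a69e9d1652da805b106-9810446605",
    "0To1Month": "",
    "groupId": {"$oid": "60d6e2b16bf83142a381f5fb"},
    "groupLastAttemptDayAgo": 1742978685000,
    "nested": {"okdok": {"$oid": "60d6e2b16bf83142a381f5f9"}},
}

result = unwrap_oid(sample)
print(json.dumps(result, indent=2))
```

Because the rule checks the shape of each value rather than the name of its parent key, it needs no list of ObjectId field names per collection, which is the "dynamic" behaviour asked for here.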

I know how to replace the key, as the Jolt spec below does, but I am having trouble including the other keys of the document (e.g. relationshipId, 0To1Month, etc. from the sample above). Help :(

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "\\$oid": "&1"
      }
    }
  }
]

This operation gives the following output:

{
  "_id" : "67e3b577b9897ea76e00bd9e",
  "groupId" : "60d6e2b16bf83142a381f5fb"
}

Asked Mar 27 at 7:53 by tomsajuk; edited Mar 28 at 5:33.
  • Why not use a MongoDB aggregation to do the transformation? Save it as a View and then use that to read and load data into ClickHouse. mongoplayground/p/3uT1wRlmCqf. Regarding "but I am facing a problem trying to include the other keys": what other keys? Be more specific about what is working and what is not. – aneroid Commented Mar 27 at 8:20
  • @aneroid The other keys from the document, like relationshipId, 0To1Month, etc. Yes, the MongoDB aggregation works, but it's not dynamic enough. I would have to go through the document for each collection that we are planning to include and convert each ObjectId present to a string, which is just a maintenance headache. – tomsajuk Commented Mar 27 at 9:45
  • Where should the value 60d6e2b16bf83142a381f5f9 within the sample (presumably expected) output come from? Is that a typo that stems from 60d6e2b16bf83142a381f5fb (ending with b instead of 9)? – Barbaros Özhan Commented Mar 27 at 12:40
  • @BarbarosÖzhan yes that was a typo. Sorry for the confusion. Edited. – tomsajuk Commented Mar 28 at 5:31

1 Answer

You can use the following shift transformation spec:

[
  {
    "operation": "shift",
    "spec": {
      "_id|groupId": {
        "\\$oid": "&1"
      },
      "nested": {
        "okdok": {
          "@2,groupId.\\$oid": "&2.&1.\\$oid"
        }
      },
      "*": "&" // else case
    }
  }
]

The main trick is escaping the dollar signs with double backslashes.
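
The backslash is doubled because two layers of escaping are involved: JSON itself consumes one backslash, and Jolt, which treats `$` on the left-hand side of a shift spec as an operator, needs the remaining one to read `$oid` as a literal key. The sketch below uses Python's standard `json` module to show the JSON layer only. It does not run Jolt, and the variable names are illustrative.

```python
import json

# Raw string: exactly what the spec file contains, with two backslashes.
spec_text = r'{"\\$oid": "&1"}'
spec = json.loads(spec_text)

# After JSON decoding, one backslash is left in front of the dollar sign.
key = next(iter(spec))
print(key)       # \$oid
print(len(key))  # 5

# A single backslash is not a valid JSON escape, so the spec would not parse.
try:
    json.loads(r'{"\$oid": "&1"}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False

print(single_backslash_ok)  # False
```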
