最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

azure - Cosmos query to retrieve records with duplicate values in a list within a JSON - Stack Overflow

programmeradmin1浏览0评论

I want to write a Cosmos query that retrieves all records with duplicate 'tokenValue' entries where the type is 'type1'

{
        "id": "1234",
        "tokens": [
            {
                "type": "type1",
                "tokenValue": "1234"
            },
            {
                "type": "type2",
                "tokenValue": "5678"
            }
        ]
    }

For example, if I have an additional record like the one below, I want my query to retrieve that record as well

{
    "id": "abcd",
    "tokens": [
        {
            "type": "type1",
            "tokenValue": "1234"
        },
        {
            "type": "type2",
            "tokenValue": "91011"
        }
    ]
}

I want to write a Cosmos query that retrieves all records with duplicate 'tokenValue' entries where the type is 'type1'

{
        "id": "1234",
        "tokens": [
            {
                "type": "type1",
                "tokenValue": "1234"
            },
            {
                "type": "type2",
                "tokenValue": "5678"
            }
        ]
    }

For example, if I have an additional record like the one below, I want my query to retrieve that record as well

{
    "id": "abcd",
    "tokens": [
        {
            "type": "type1",
            "tokenValue": "1234"
        },
        {
            "type": "type2",
            "tokenValue": "91011"
        }
    ]
}
Share Improve this question edited Mar 19 at 5:05 David Makogon 70.9k22 gold badges145 silver badges198 bronze badges asked Mar 18 at 19:15 AvinashAvinash 2,1832 gold badges17 silver badges34 bronze badges 6
  • Could you please share what you tried – Balaji Commented Mar 19 at 1:59
  • I tried below SELECT * FROM c WHERE ARRAY_LENGTH( ARRAY( SELECT DISTINCT t.tokenValue FROM t IN c.tokens WHERE t.tokenType = "Accertify" AND ( SELECT COUNT(1) FROM c.tokens AS innerToken WHERE innerToken.tokenType = "Accertify" AND innerToken.tokenValue = t.tokenValue ) > 1 ) ) > 0 – Avinash Commented Mar 19 at 4:29
  • But above query did not work. I also tried so many queries but none of them worked. It would have been easy query in SQL using CTEs – Avinash Commented Mar 19 at 4:31
  • Please edit your question to provide properly-formatted code. Very difficult to read in a comment. Also, you need to show the results and why it isn’t working (a minimal reproducible example) – David Makogon Commented Mar 19 at 5:05
  • Yes, It is easy to find using CTE in SQL. But AFAIK in cosmos it is not supported. I tried by using c# and retrieved duplicate values successfully. – Balaji Commented Mar 19 at 5:38
 |  Show 1 more comment

1 Answer 1

Reset to default 0

Cosmos query to retrieve records with duplicate values in a list within a JSON

Yes, you can achieve it by using c#. I tried by using cosmos query but as the data is containing duplicates across multiple items, it's not retrieving the duplicate values. Try with the below code, it retrieves all documents that contain at least one tokens array entry where type = "type1" and later it checks only tokenValue values where type = "type1". Now, it finds duplicates using Group by tokenValue to retrieve the duplicate values in a list within a JSON as shown in the below output.

static async Task Main()
    {
        using CosmosClient client = new CosmosClient(EndpointUri, PrimaryKey);
        var container = client.GetContainer(DatabaseId, ContainerId);

        string query = @"
            SELECT c.id, c.tokens
            FROM c
            WHERE EXISTS (SELECT VALUE t FROM t IN c.tokens WHERE t.type = 'type1')";

        var tokenRecords = new List<(string id, JObject document, List<string> tokenValues)>();

        using FeedIterator<dynamic> resultSet = container.GetItemQueryIterator<dynamic>(query);

        while (resultSet.HasMoreResults)
        {
            foreach (var item in await resultSet.ReadNextAsync())
            {
                var document = JObject.FromObject(item);
                var id = document["id"]?.ToString();

                var tokenValues = ((JArray)document["tokens"])
                    .Where(t => t["type"]?.ToString() == "type1")
                    .Select(t => t["tokenValue"]?.ToString())
                    .ToList();

                tokenRecords.Add((id, document, tokenValues));
            }
        }

        var duplicateTokens = tokenRecords
            .SelectMany(r => r.tokenValues.Select(tv => (r.id, tv, r.document)))
            .GroupBy(x => x.tv)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.Select(x => x.document))
            .Distinct()
            .ToList();

        Console.WriteLine("Documents with duplicate tokenValue:");
        foreach (var doc in duplicateTokens)
        {
            Console.WriteLine(doc.ToString()); 
        }
    }

Output:

Documents with duplicate tokenValue:
{
  "id": "1234",
  "tokens": [
    {
      "type": "type1",
      "tokenValue": "1234"
    },
    {
      "type": "type2",
      "tokenValue": "5678"
    }
  ]
}
{
  "id": "abcd",
  "tokens": [
    {
      "type": "type1",
      "tokenValue": "1234"
    },
    {
      "type": "type2",
      "tokenValue": "91011"
    }
  ]
}
发布评论

评论列表(0)

  1. 暂无评论