最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

c# - Cosmos DB Inserts All Data, but SQL Server Only Inserts Some Records Within a Transaction - Stack Overflow

programmeradmin2浏览0评论

I am working on a system that initially stored data in Azure SQL Server. However, due to limitations in the table design that made it unsuitable for analyzing a promotion activity, we decided to also store the data in a NoSQL database for better querying and analysis.

To ensure consistency, I use a transaction when saving data to Azure SQL Server and Azure Cosmos DB. My understanding is that both _dbContext.SaveChangesAsync and transaction.CommitAsync are atomic operations, so I expected all or none of the records to be stored. According to the official documentation, "the transaction will auto-rollback when disposed if either command fails." However, I noticed that:

  • All data is successfully inserted into Azure Cosmos DB.
  • Only some records appear in Azure SQL Server.

Additional Information:

  • Expected Behavior:
    • Both SQL Server and Cosmos DB should store the same number of records.
  • Things I Have Checked:
    • redPoints contains the expected number of elements before insertion.
    • No unique constraints, triggers, or other SQL Server table restrictions should be causing silent rejections.
    • We recently changed our logging system, so we lost past log data, making debugging more difficult.
    • We have not enabled the SQL retry strategy.
  • Tools:
    • NET: 6.0
    • Microsoft.EntityFrameworkCore: 7.0.7
    • MongoDB.Drive: 2.22.0

Questions:

  • What could cause SQL Server to insert only some of the records while Cosmos DB successfully inserts all of them?
  • Is there a potential issue with SaveChangesAsync or transaction handling?
  • How can I debug this issue further?

Here is the simplfied code:

private async Task SaveRedPointsToDb(
    List<RedPoint> redPoints,
    CancellationToken cancellationToken
)
{
    await using var transaction = await _dbContext.Database.BeginTransactionAsync(
        cancellationToken
    );
    try
    {
        // Azure SQL Server
        await _dbContext.RedPoints.AddRangeAsync(redPoints, cancellationToken);
        await _dbContext.SaveChangesAsync(cancellationToken);

        // Azure Cosmos DB
        var redPointDocuments = redPoints.Select(x => new RedPointDocument(x)).toList();
        await _redPointCollection.InsertManyAsync(redPointDocuments);

        await transaction.CommitAsync(cancellationToken);
    }
    catch (Exception e)
    {
        // Log...
    }
}

I am working on a system that initially stored data in Azure SQL Server. However, due to limitations in the table design that made it unsuitable for analyzing a promotion activity, we decided to also store the data in a NoSQL database for better querying and analysis.

To ensure consistency, I use a transaction when saving data to Azure SQL Server and Azure Cosmos DB. My understanding is that both _dbContext.SaveChangesAsync and transaction.CommitAsync are atomic operations, so I expected all or none of the records to be stored. According to the official documentation, "the transaction will auto-rollback when disposed if either command fails." However, I noticed that:

  • All data is successfully inserted into Azure Cosmos DB.
  • Only some records appear in Azure SQL Server.

Additional Information:

  • Expected Behavior:
    • Both SQL Server and Cosmos DB should store the same number of records.
  • Things I Have Checked:
    • redPoints contains the expected number of elements before insertion.
    • No unique constraints, triggers, or other SQL Server table restrictions should be causing silent rejections.
    • We recently changed our logging system, so we lost past log data, making debugging more difficult.
    • We have not enabled the SQL retry strategy.
  • Tools:
    • NET: 6.0
    • Microsoft.EntityFrameworkCore: 7.0.7
    • MongoDB.Drive: 2.22.0

Questions:

  • What could cause SQL Server to insert only some of the records while Cosmos DB successfully inserts all of them?
  • Is there a potential issue with SaveChangesAsync or transaction handling?
  • How can I debug this issue further?

Here is the simplfied code:

private async Task SaveRedPointsToDb(
    List<RedPoint> redPoints,
    CancellationToken cancellationToken
)
{
    await using var transaction = await _dbContext.Database.BeginTransactionAsync(
        cancellationToken
    );
    try
    {
        // Azure SQL Server
        await _dbContext.RedPoints.AddRangeAsync(redPoints, cancellationToken);
        await _dbContext.SaveChangesAsync(cancellationToken);

        // Azure Cosmos DB
        var redPointDocuments = redPoints.Select(x => new RedPointDocument(x)).toList();
        await _redPointCollection.InsertManyAsync(redPointDocuments);

        await transaction.CommitAsync(cancellationToken);
    }
    catch (Exception e)
    {
        // Log...
    }
}
Share Improve this question asked Feb 7 at 7:51 Ethan GuoEthan Guo 515 bronze badges 5
  • Is that _dbContext instance fresh and only used here? Or does it already have RedPoint objects tracked? Can you test with creating a new _dbContext just at the start of your method. – rene Commented Feb 7 at 8:07
  • 1 Using Cosmos "for better querying and analysis" vs Azure SQL database definitely sounds weird. Azure SQL database is objectively better at this due to much less limited query syntax (allowing you to join different documents for example) and ability to provide secondary indexes and not being constrained by the single partition key decision – Martin Smith Commented Feb 7 at 8:14
  • I see you are using the Mongo API so technically can use an aggregation pipeline with $lookup to get the gateway to perform joins for you but I have found RU costs for this are not cheap and AFAIK the gateway is essentially just sending the individual lookup requests one by one – Martin Smith Commented Feb 7 at 8:26
  • @rene Thanks for your helpful reply. The dbContext is registered with a Scoped lifetime and it is fresh. Before executing the code snippet, only Order objects were being tracked. This code has been running in production for weeks, and we encountered the issue only on one day. Despite inputting the same data in our debugging environment, we haven't been able to reproduce the problem. – Ethan Guo Commented Feb 7 at 8:44
  • @MartinSmith I agree that SQL is generally better for querying and analysis. However, due to our table design, we need to use LIKE to filter a large dataset and perform additional string processing. After testing, we found that using NoSQL significantly improved performance for our specific use case, which is why we went this route. – Ethan Guo Commented Feb 7 at 8:55
Add a comment  | 

1 Answer 1

Reset to default 0

The code looks ok. As written a commit failure on SQL would cause that. In SQL Server commit failures are rare, and typically only happen when you have an IO issue like running out of space on the transaction log, or are using in-memory OLTP which does commit-time conflict resolution. Your logging should have caught that.

Another possible cause is a crash in the client program after writing to Cosmos but before committing in SQL server.

Also the rows may have been inserted, and then deleted in SQL Server.

In general I would recommend abandoning the dual-write pattern, in favor of an eventually-consistent "write-after" pattern. EG write to SQL and include a flag on the row, or a row in a "queue table" indicating that the row needs to be replicated to Cosmos. Then have a background process that copies all such rows and updates the flag or removes the rows from the queue table.

与本文相关的文章

发布评论

评论列表(0)

  1. 暂无评论