I am working on a system that initially stored data in Azure SQL Server. However, due to limitations in the table design that made it unsuitable for analyzing a promotion activity, we decided to also store the data in a NoSQL database for better querying and analysis.
To ensure consistency, I use a transaction when saving data to Azure SQL Server and Azure Cosmos DB. My understanding is that both `_dbContext.SaveChangesAsync` and `transaction.CommitAsync` are atomic operations, so I expected all or none of the records to be stored. According to the official documentation, "the transaction will auto-rollback when disposed if either command fails." However, I noticed that:
- All data is successfully inserted into Azure Cosmos DB.
- Only some records appear in Azure SQL Server.
Additional Information:
- Expected Behavior:
  - Both SQL Server and Cosmos DB should store the same number of records.
- Things I Have Checked:
  - `redPoints` contains the expected number of elements before insertion.
  - No unique constraints, triggers, or other SQL Server table restrictions should be causing silent rejections.
  - We recently changed our logging system, so we lost past log data, making debugging more difficult.
  - We have not enabled the SQL retry strategy.
- Tools:
  - .NET: 6.0
  - Microsoft.EntityFrameworkCore: 7.0.7
  - MongoDB.Driver: 2.22.0
Questions:
- What could cause SQL Server to insert only some of the records while Cosmos DB successfully inserts all of them?
- Is there a potential issue with `SaveChangesAsync` or transaction handling?
- How can I debug this issue further?
Here is the simplified code:
```csharp
private async Task SaveRedPointsToDb(
    List<RedPoint> redPoints,
    CancellationToken cancellationToken
)
{
    await using var transaction = await _dbContext.Database.BeginTransactionAsync(
        cancellationToken
    );
    try
    {
        // Azure SQL Server
        await _dbContext.RedPoints.AddRangeAsync(redPoints, cancellationToken);
        await _dbContext.SaveChangesAsync(cancellationToken);

        // Azure Cosmos DB
        var redPointDocuments = redPoints.Select(x => new RedPointDocument(x)).ToList();
        await _redPointCollection.InsertManyAsync(redPointDocuments);

        await transaction.CommitAsync(cancellationToken);
    }
    catch (Exception e)
    {
        // Log...
    }
}
```
Asked Feb 7 at 7:51 by Ethan Guo
1 Answer
The code looks OK. As written, a commit failure on SQL Server would cause exactly this. In SQL Server, commit failures are rare; they typically only happen when there is an I/O issue, such as running out of space on the transaction log, or when you are using In-Memory OLTP, which performs commit-time conflict resolution. Your logging should have caught that.
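Since the `catch` in the question swallows the exception, any such commit failure would be silent. A minimal sketch of surfacing it (assuming an injected `ILogger` field named `_logger`, which is not in the original code):

```csharp
try
{
    await transaction.CommitAsync(cancellationToken);
}
catch (Exception e)
{
    // e.ToString() is captured by LogError, so inner SqlException
    // numbers and state survive in the new logging system.
    _logger.LogError(e,
        "SQL commit failed after the Cosmos insert; the SQL changes roll back " +
        "on dispose, but the Cosmos documents remain.");
    throw; // re-throw so the caller sees the failure instead of a silent partial write
}
```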
Another possible cause is a crash in the client program after writing to Cosmos but before committing in SQL Server.
The rows may also have been inserted into SQL Server and then deleted afterwards.
In general I would recommend abandoning the dual-write pattern in favor of an eventually consistent "write-after" pattern. E.g., write to SQL Server and include a flag on the row, or a row in a "queue table", indicating that the row needs to be replicated to Cosmos. Then have a background process that copies all such rows and updates the flag or removes the rows from the queue table.
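A minimal sketch of that write-after pattern, assuming a hypothetical `OutboxMessage` queue-table entity and a `DbSet<OutboxMessage> OutboxMessages` on the context (these names, and the `Id` properties on `RedPoint`/`RedPointDocument`, are illustrative, not from the question):

```csharp
// Hypothetical queue-table entity.
public class OutboxMessage
{
    public long Id { get; set; }
    public long RedPointId { get; set; }
}

// 1) One SQL transaction writes both the rows and the queue entries,
//    so SQL Server remains the single source of truth.
private async Task SaveRedPointsToDb(List<RedPoint> redPoints, CancellationToken ct)
{
    await using var tx = await _dbContext.Database.BeginTransactionAsync(ct);
    _dbContext.RedPoints.AddRange(redPoints);
    await _dbContext.SaveChangesAsync(ct); // populates generated ids
    _dbContext.OutboxMessages.AddRange(
        redPoints.Select(r => new OutboxMessage { RedPointId = r.Id }));
    await _dbContext.SaveChangesAsync(ct);
    await tx.CommitAsync(ct);
}

// 2) A background loop drains the queue into Cosmos. Upserting via
//    ReplaceOneAsync with IsUpsert makes the copy idempotent, so a
//    crash-and-retry cannot create duplicates.
private async Task ReplicateToCosmosAsync(CancellationToken ct)
{
    while (!ct.IsCancellationRequested)
    {
        var pending = await _dbContext.OutboxMessages
            .OrderBy(m => m.Id)
            .Take(100)
            .ToListAsync(ct);

        foreach (var msg in pending)
        {
            var row = await _dbContext.RedPoints.FindAsync(
                new object[] { msg.RedPointId }, ct);
            var doc = new RedPointDocument(row);
            await _redPointCollection.ReplaceOneAsync(
                d => d.Id == doc.Id,
                doc,
                new ReplaceOptions { IsUpsert = true },
                ct);
        }

        _dbContext.OutboxMessages.RemoveRange(pending);
        await _dbContext.SaveChangesAsync(ct);

        await Task.Delay(TimeSpan.FromSeconds(5), ct);
    }
}
```

If either store is briefly unavailable, the queue simply grows and drains later; Cosmos lags SQL Server but can never disagree with it permanently.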
Comments:
- …`$lookup` to get the gateway to perform joins for you, but I have found RU costs for this are not cheap and AFAIK the gateway is essentially just sending the individual lookup requests one by one. – Martin Smith, Feb 7 at 8:26
- …`Scoped` lifetime and it is fresh. Before executing the code snippet, only Order objects were being tracked. This code has been running in production for weeks, and we encountered the issue only on one day. Despite inputting the same data in our debugging environment, we haven't been able to reproduce the problem. – Ethan Guo, Feb 7 at 8:44
- …`LIKE` to filter a large dataset and perform additional string processing. After testing, we found that using NoSQL significantly improved performance for our specific use case, which is why we went this route. – Ethan Guo, Feb 7 at 8:55