How to implement a one-directional data sync service between a main SQL Server database and a secondary read-only database

I have a main internal app that handles business management. A mobile app has been developed, and only part of the data in the main DB has to be exposed to it.

The biggest non-functional requirement is performance: it has to be "blazingly fast". The mobile app has to read data "blazingly fast", and it will have only one action/mutation, so I was thinking of building a read-only replica of the main database where the mobile app can fetch data. The single action/mutation can go directly to the main database, since there will be far fewer of those and they are less important.

Here is a high-level architectural view of this:

To keep the queries optimized, the Read database will hold strictly the data needed, but in a different shape from the Main database: possibly different table relationships and far fewer columns. This requires a Data Sync Service that watches the Main database for specific changes, transforms the data into a predefined read-only form, and inserts it into the Read database every time something new, such as an insertion or update, happens in the Main database.

The latency between a change in the Main database and the corresponding insertion in the Read database should be as low as possible, mimicking or actually being some form of real-time communication. The performance impact on the Main database should also be little to none; ideally, it should not even know that any of this exists.

Beyond classic backend database work (queries, mutations, creating tables, relationships, indexes, and so on), I have never done anything even remotely similar to what I have described here, meaning syncing databases.

My main question is how to go about designing and implementing this Data Sync Service. What will this Data Sync Service be? Is it a set of Microsoft tools (that I have never heard of) that can do this at the database level? Is it a hand-written backend service that can somehow watch for database changes without constantly polling the Main database? Or is it a combination of both?

I am having a hard time working out how to do this since, as I said, I have never done anything even remotely like it.

The only thing I can think of with my current knowledge is a backend service that listens for the specific database changes, models the data, and then inserts it into the Read database, but I have no idea how to "listen" for changes in the Main database. I'd like to "listen" only for specific changes and have those sent asynchronously to my service, putting as little load on the Main database as possible.

I use Microsoft tech (SQL Server and .NET), if that helps, but I have no cloud, just VPSes, so cloud services are not an option.

And yes, realistically, all the existing data in the Main database first has to be migrated into the shape of the Read database, and only then does the Data Sync Service start applying updates, but that is a separate task that I can manage.

Thanks.

Edit 1: So far I have found two possible solutions. The first is SQL Server Service Broker, the built-in asynchronous messaging queue of SQL Server. For this I would have to create triggers for the changes I am looking for and send them to the Read database through this messaging queue. I could then either consume the queue from a backend service that models the data and inserts it, or do it all inside the database with mapping and inserting stored procedures. The second is a tool like Debezium, which listens to changes from the database transaction log and sends them to a Kafka topic; a backend service consumes the topic, maps the data, and inserts it into the Read database. This solution seems more performant because there are no triggers adding overhead to the Main database, but it requires additional infrastructure (Kafka).
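For illustration, the trigger plus Service Broker option could be sketched roughly as follows in T-SQL. This is a minimal sketch under stated assumptions, not a tested implementation: the table `dbo.Orders`, its columns, and every `//Sync/...` and `Sync...Queue` name are hypothetical, Service Broker is assumed to be enabled on the database, and both services live in the same database for brevity (sending to another database or instance additionally needs routing and security setup not shown here).

```sql
-- All object names below are hypothetical and chosen for illustration only.
-- Assumes Service Broker is already enabled on this database.
CREATE MESSAGE TYPE [//Sync/RowChanged] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//Sync/Contract] ([//Sync/RowChanged] SENT BY INITIATOR);

CREATE QUEUE dbo.SyncInitiatorQueue;
CREATE QUEUE dbo.SyncTargetQueue;

CREATE SERVICE [//Sync/Initiator] ON QUEUE dbo.SyncInitiatorQueue;
CREATE SERVICE [//Sync/Target] ON QUEUE dbo.SyncTargetQueue ([//Sync/Contract]);
GO

-- The trigger only enqueues a small XML payload; the transformation and the
-- write to the Read database happen later, outside the user's transaction.
CREATE TRIGGER dbo.trg_Orders_Sync
ON dbo.Orders
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @handle UNIQUEIDENTIFIER;
    DECLARE @body XML;

    -- Only the columns the read model needs (hypothetical column names).
    SET @body = (SELECT OrderId, Status
                 FROM inserted
                 FOR XML PATH('row'), ROOT('changes'), TYPE);

    IF @body IS NULL RETURN;  -- the statement affected no rows

    BEGIN DIALOG CONVERSATION @handle
        FROM SERVICE [//Sync/Initiator]
        TO SERVICE '//Sync/Target'
        ON CONTRACT [//Sync/Contract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @handle
        MESSAGE TYPE [//Sync/RowChanged] (@body);
    -- A real implementation would also end or reuse conversations;
    -- that bookkeeping is omitted from this sketch.
END;
GO

-- Consumer side: a backend service (or an activated stored procedure) blocks
-- on the queue instead of polling the base tables.
WAITFOR (
    RECEIVE TOP (1)
        conversation_handle,
        message_type_name,
        CAST(message_body AS XML) AS body
    FROM dbo.SyncTargetQueue
), TIMEOUT 5000;
```

The trigger still runs inside the writing transaction, so it adds some overhead to the Main database, which is the trade-off against the log-based option noted above.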

Asked Jan 20 at 13:19 by Timotei Oros (edited Jan 21 at 9:56)
  • Do you need changes to main db from non-APP sources to be available in your mobile apps too? – siggemannen Commented Jan 20 at 13:37
  • @siggemannen Given that Main DB is part of an overall internal app where I am the admin, when I do an insert or update to table X from the app, that insert or update needs to be synced with the Read DB. Also, only a handful of tables should be "watched". Changes to Main DB can only occur from the internal app and eventually from that single possible action in the Mobile App. – Timotei Oros Commented Jan 20 at 13:45

1 Answer

"The biggest non-functional requirement is performance, it has to be "blazingly fast". The mobile app has to read data "blazingly fast", and it will have only one action/mutation, so I was thinking to do a read-only replica of the main database"

You're trying to solve a design problem with an architecture solution. It's premature, expensive, and probably overkill.

Instead consider something simple like an indexed view that provides the data that the mobile app needs from the main database, and make sure you're using READ COMMITTED SNAPSHOT isolation.
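As a rough T-SQL sketch of that suggestion: the database name `MainDb`, the table `dbo.Orders`, its columns, and the view and index names are all assumptions made up for illustration, not details from the question.

```sql
-- Hypothetical database, table, column, and index names throughout.

-- Readers see the last committed row version instead of waiting on writers.
-- WITH ROLLBACK IMMEDIATE rolls back other sessions' open transactions,
-- so this is best run during a quiet period.
ALTER DATABASE [MainDb] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
GO

-- Indexed views require SCHEMABINDING and two-part object names, and the
-- session must use the required SET options (for example QUOTED_IDENTIFIER ON
-- and ANSI_NULLS ON).
CREATE VIEW dbo.MobileOrderSummary
WITH SCHEMABINDING
AS
SELECT o.OrderId, o.CustomerId, o.Status, o.UpdatedAt
FROM dbo.Orders AS o
WHERE o.IsVisibleToMobile = 1;
GO

-- The unique clustered index is what materializes the view on disk.
CREATE UNIQUE CLUSTERED INDEX IX_MobileOrderSummary_OrderId
    ON dbo.MobileOrderSummary (OrderId);

-- Additional nonclustered indexes can support the mobile app's lookups.
CREATE NONCLUSTERED INDEX IX_MobileOrderSummary_CustomerId
    ON dbo.MobileOrderSummary (CustomerId);
GO

-- Query issued on behalf of the mobile app. On some editions the optimizer
-- only uses the view's indexes when NOEXPAND is specified.
DECLARE @CustomerId INT = 42;

SELECT OrderId, Status, UpdatedAt
FROM dbo.MobileOrderSummary WITH (NOEXPAND)
WHERE CustomerId = @CustomerId;
```

SQL Server maintains the view's indexes synchronously as part of each write to `dbo.Orders`, so there is no sync lag to manage, at the cost of some extra work per write on the base table.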
