design patterns - Handling Multi-Table Updates in Kafka - Stack Overflow


We have an application that serves as a configuration repository, storing data in a relational database. Whenever a user changes a configuration item, it is persisted to the database and our goal is to communicate these changes via Kafka messages to subscribers.

In the web UI, users can modify multiple sections of the configuration, often affecting multiple tables. After initial research, I understand that the typical way to model relational DB-like applications is by using one Kafka topic per table. However, this approach triggers multiple Kafka messages to different topics when a user modifies multiple tables.

How can consumers know how long to wait until all modifications are completed to react to changes? Or can you suggest an alternative way to model something similar?


1 Answer


"After initial research, I understand that the typical way to model relational DB-like applications is by using one Kafka topic per table."

It depends on the actual use case. It is perfectly fine to have a single topic that carries all the changes grouped into a single message, if that approach suits your consumers. I.e. you would have a message looking like the following:

{
   "user":"...",
   "changes":[
      {
         "entity":"one...",
         ... // changes
      },
      {
         "entity":"two...",
         ... // changes
      }
   ]
}

Note that if a single operation can contain a lot of changes, you might run into Kafka's message size limit (roughly 1 MB by default).
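
For illustration, here is a minimal sketch of this grouped-message approach using the Java client. The topic name, the key, the payload fields, and the 5 MB figure are all assumptions made up for the example, not something from the question:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class GroupedChangePublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // The size caps all default to roughly 1 MB and must be raised together:
        // max.request.size (producer), message.max.bytes on the broker
        // (max.message.bytes per topic), and max.partition.fetch.bytes
        // (consumer). The 5 MB value here is only an example.
        props.put("max.request.size", String.valueOf(5 * 1024 * 1024));

        // One message carries every table change from the user's save action,
        // so consumers never have to wait for "the rest" of the operation.
        String payload = "{\"user\":\"alice\",\"changes\":["
                + "{\"entity\":\"one\",\"field\":\"x\",\"value\":\"1\"},"
                + "{\"entity\":\"two\",\"field\":\"y\",\"value\":\"2\"}]}";

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by the configuration's id keeps all operations for the
            // same configuration on one partition, preserving their order.
            producer.send(new ProducerRecord<>("config-changes", "config-42", payload));
        }
    }
}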

Another approach is to keep a single topic but publish one message per table, with metadata that carries a unique "transaction"/operation id and the total number of changes in that "transaction"/operation:

{
   "user":"...",
   "metadata":{
      "transaction":"unique_tran_id",
      "operationNumber":"unique_operation_in_tran_id",
      "changesInTransaction":"total_num_of_changes"
   },
   "entity":"...",
   ... // changes
}

Then you can use a saga-like approach where the processor accumulates the incoming messages, counts the processed operations, and decides when processing is complete (using the count of processed items plus deduplication based on the operationNumber). This approach can also span multiple topics if needed. One note, though: IMO this approach is best coupled with the transactional outbox pattern, so you minimize the chance that only some of a "transaction"/operation's messages get published (in which case processing of that "transaction"/operation would never complete).
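
A rough sketch of such an accumulator, assuming the metadata fields above have already been deserialized from each consumed record; the class and method names are illustrative:

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class TransactionAccumulator {
    // One pending bucket per "transaction" id (a single consumer thread is assumed).
    private static final class Bucket {
        final int expected;                                       // changesInTransaction
        final Map<String, String> byOperation = new HashMap<>();  // operationNumber -> payload
        Bucket(int expected) { this.expected = expected; }
    }

    private final Map<String, Bucket> pending = new HashMap<>();

    /** Call once per consumed record. Returns the full set of changes when the
     *  last message of the "transaction" arrives, otherwise an empty Optional. */
    public Optional<Collection<String>> accept(String transactionId,
                                               String operationNumber,
                                               int changesInTransaction,
                                               String payload) {
        Bucket bucket = pending.computeIfAbsent(
                transactionId, id -> new Bucket(changesInTransaction));
        // putIfAbsent deduplicates redelivered copies of the same operation.
        bucket.byOperation.putIfAbsent(operationNumber, payload);
        if (bucket.byOperation.size() == bucket.expected) {
            pending.remove(transactionId);
            return Optional.of(bucket.byOperation.values());
        }
        return Optional.empty();
    }
}

And a minimal sketch of the transactional outbox side, assuming plain JDBC and hypothetical config_item and outbox tables; a separate relay (a poller, or a CDC tool such as Debezium) then reads the outbox table and publishes to Kafka:

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

public class OutboxWriter {
    private final DataSource dataSource;

    public OutboxWriter(DataSource dataSource) { this.dataSource = dataSource; }

    /** Persists the config change and the outgoing message atomically:
     *  either both rows land in the database, or neither does. */
    public void saveWithOutbox(String configId, String newValue, String eventJson)
            throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement upd = conn.prepareStatement(
                         "UPDATE config_item SET value = ? WHERE id = ?");
                 PreparedStatement out = conn.prepareStatement(
                         "INSERT INTO outbox (topic, payload) VALUES (?, ?)")) {
                upd.setString(1, newValue);
                upd.setString(2, configId);
                upd.executeUpdate();
                out.setString(1, "config-changes"); // illustrative topic name
                out.setString(2, eventJson);
                out.executeUpdate();
                conn.commit();
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}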

Also, do not forget that you are not limited to a single output Kafka topic. There are approaches where you have separate "business" and "technical" output queues: for example, a queue per table as the "technical" ones, and a "business" one using one of the first two approaches, as sketched below.
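
A compact illustration of that split, with purely hypothetical topic names (one "technical" topic per table plus one grouped "business" topic):

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FanOutPublisher {
    private final Producer<String, String> producer;

    public FanOutPublisher(Producer<String, String> producer) {
        this.producer = producer;
    }

    /** Publishes the per-table changes and the grouped summary for one operation. */
    public void publish(String key, String tableOneJson, String tableTwoJson,
                        String groupedJson) {
        producer.send(new ProducerRecord<>("config.table-one", key, tableOneJson));      // technical
        producer.send(new ProducerRecord<>("config.table-two", key, tableTwoJson));      // technical
        producer.send(new ProducerRecord<>("config.business-events", key, groupedJson)); // business
    }
}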
