cassandra - Writes fail when lightweight transactions cannot reach quorum

In a three-node Cassandra cluster, I consistently run into the same fatal situation on tables that are written solely using Cassandra's lightweight transactions (CAS).

Whenever a lightweight transaction fails to reach quorum (1/2), e.g. due to high load, any subsequent attempt to write data within a transaction fails, i.e. does not return "[applied]"=true.

Using select * from system.paxos where cf_id=<id of table>, I see that there are entries, which I assume to be pending transactions.

Further, in /var/log/Cassandra/system.log I see logs like:

INFO  [ScheduledTasks:1] 2025-01-12 21:46:53,005 UncommittedTableData.java:567 - \
  Scheduling uncommitted paxos data merge task for <any other table>

INFO  [OptionalTasks:1] 2025-01-12 21:46:53,006 PaxosCleanupLocalCoordinator.java:89 - \
  Completing uncommitted paxos instances for <table in stalled state> on ranges

However, I can't figure out how to resolve this state. Running nodetool repair -full <keyspace> (and variations), as well as restarting all nodes, did not resolve the issue.

Further information:

Cassandra version: 4.1.5
replication strategy: SimpleStrategy
replication factor: 3

asked Jan 13 at 7:10 by PeMa, edited Jan 24 at 4:56 by Erick Ramirez


1 Answer


Lightweight transactions (LWTs) are expensive operations since they require a read-before-write, meaning the data must be read to verify the conditional IF in the statement before the write is executed.

Prior to Paxos v2, added in Cassandra 4.1 (CASSANDRA-17164), LWTs required four round-trips for the [extended] Paxos phases: prepare/promise, serial read, propose/accept, commit. As a result, LWTs add significantly more load than regular writes, so when nodes are overloaded it is expected that LWTs perform even worse and fail to reach a quorum of replicas.
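To make the quorum requirement concrete for the replication factor of 3 mentioned in the question, here is a minimal Python sketch of the arithmetic. The function names are illustrative only and are not part of any Cassandra API. Reading the "(1/2)" in the question as "one response received out of two required" is an assumption.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """How many replicas can be unresponsive while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


def reached_quorum(responses: int, replication_factor: int) -> bool:
    """True if enough replicas answered to satisfy a quorum."""
    return responses >= quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication factor from the question
    print("quorum:", quorum(rf))
    print("tolerated failures:", tolerated_failures(rf))
    # Assumed reading of "(1/2)": one response received, two required.
    print("1 response is enough:", reached_quorum(1, rf))
```

With a replication factor of 3, two replicas have to answer every Paxos phase, so a single slow or overloaded replica is tolerated but a second one is not.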

Running a repair does not solve the underlying issue of the nodes being overloaded. In fact, repairs add even more load, like adding fuel to a cluster that's already on fire.

You should address the root cause of the problem. I recommend that you review the capacity of your cluster and analyse the utilisation of resources like disk, CPU and memory. It may be necessary for you to consider adding more nodes. Cheers!
