I got a warning in my AWS console about the InnoDB history list length increasing on one cluster. Taking a look at show engine innodb status, I see that the number in the "Purge done" line isn't changing from run to run:
------------
TRANSACTIONS
------------
Trx id counter 218703222601
Purge done for trx's n:o < 218637806467 undo n:o < 0 state: running but idle <---- Not changing
History list length 29748626
while the trx ID counter and the history list length are both increasing. There are no open transactions:
mysql> SELECT * FROM information_schema.innodb_trx\G
Empty set (0.00 sec)
AWS's recommendations for fixing this are unhelpful: they talk about long-running transactions or query tuning, neither of which applies here (all queries complete in the sub-second range, and most of them are reads).
mysql> show global variables like 'innodb%purge%';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| innodb_max_purge_lag                 | 0     |
| innodb_max_purge_lag_delay           | 0     |
| innodb_purge_batch_size              | 3600  |
| innodb_purge_rseg_truncate_frequency | 128   |
| innodb_purge_threads                 | 6     |
+--------------------------------------+-------+
Not sure if any of that is helpful, but I don't know what else to look at.
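As an aside, the history list length should also be readable as a plain counter from information_schema.innodb_metrics (this assumes the default trx_rseg_history_len metric is enabled, which it normally is), which is easier to track from run to run than eyeballing the status output:
mysql> SELECT * FROM information_schema.innodb_metrics WHERE name = 'trx_rseg_history_len'\G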
- I had the same thing happen a few months ago, on an Aurora serverless MySQL cluster. After about two weeks it mysteriously cleared up on its own. I had a support ticket open during that time, but they could never figure out why it was doing this, and it had cleared up by the time it had been escalated to the level of support that could actually look into it. I recommend you open an AWS support ticket with all the information in this question. I really think this is a bug in Aurora MySQL that they need to fix. You might also try posting at dba.stackexchange. – Mark B, Jan 31
- Thanks, I did open a ticket with AWS this morning, but based on past experience, I don't expect much. – Swechsler, Jan 31
1 Answer
AWS support got back to me and asked me to run the following to get the oldest transaction (read view) held by each node in the cluster:
SELECT server_id,
       IF(session_id = 'master_session_id', 'writer', 'reader') AS role,
       replica_lag_in_msec,
       oldest_read_view_trx_id,
       oldest_read_view_lsn
FROM mysql.ro_replica_status;
That told me that one of the readers was holding a transaction open. I found a SELECT statement on that reader that had been stuck for several days. Killing that query resolved the problem.
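For anyone who hits the same thing, here's a rough sketch of how the stuck statement can be found and killed once you know which reader is holding it. The 3600-second threshold and the thread id 12345 are placeholders; mysql.rds_kill is the RDS/Aurora procedure for terminating a connection, since a plain KILL may not be available to your user:
-- Connect to the affected reader instance and list long-running statements
SELECT id, user, time, state, LEFT(info, 100) AS query
FROM information_schema.processlist
WHERE command <> 'Sleep'
  AND time > 3600
ORDER BY time DESC;

-- Terminate the offending connection (12345 is a placeholder for the id found above)
CALL mysql.rds_kill(12345);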