database - QuestDB ApplyWal2TableJob error and invalid column size

I have a script using psycopg to ingest data into QuestDB, as seen in the docs. After a while I see this error in the logs and the table is marked as suspended:

PGConnectionContext error [msg=`Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]`, errno=0]

This was in QuestDB 8.8.2.

I saw the log had extra information complaining about WAL log apply.

2025-02-21T12:03:21.253673Z E i.q.c.w.ApplyWal2TableJob error applying SQL to wal table [table=solana_create_events, sql=UPDATE solana_create_events
                set ipfs_data = '{"name": "REDACTED", "symbol": "CRYPTOBALL", "description": "REDACTED ", "image": "redacted", "showName": true, "createdOn": "https://pump.fun"}', updated_at = '2025-02-21T11:31:36.522293'::timestamp
                where ts = '2024-12-27 00:00:10'
                and token = 'redacted'
                , error=Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0], errno=0]
2025-02-21T12:03:21.254316Z C i.q.c.w.ApplyWal2TableJob job failed, table suspended [table=solana_create_events~9, seqTxn=30797315, error=
io.questdb.cairo.CairoException: [0] Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]
    at io.questdb.cairo.CairoException.instance(CairoException.java:326)
    at io.questdb.cairo.CairoException.critical(CairoException.java:68)
    at io.questdb.cairo.TableReader.reloadColumnAt(TableReader.java:1319)
    at io.questdb.cairo.TableReader.openPartitionColumns(TableReader.java:1015)
    at io.questdb.cairo.TableReader.openPartition0(TableReader.java:982)
    at io.questdb.cairo.TableReader.openPartition(TableReader.java:459)
    at io.questdb.cairo.pool.ReaderPool$R.openPartition(ReaderPool.java:137)
    at io.questdb.cairo.IntervalFwdPartitionFrameCursor.next(IntervalFwdPartitionFrameCursor.java:56)
    at io.questdb.griffin.engine.table.FwdTableReaderPageFrameCursor.next(FwdTableReaderPageFrameCursor.java:120)
    at io.questdb.griffin.engine.table.PageFrameRecordCursorImpl.hasNext(PageFrameRecordCursorImpl.java:111)
    at io.questdb.griffin.engine.table.FilteredRecordCursor.hasNext(FilteredRecordCursor.java:66)
    at io.questdb.griffin.engine.AbstractVirtualFunctionRecordCursor.hasNext(AbstractVirtualFunctionRecordCursor.java:81)
    at io.questdb.griffin.UpdateOperatorImpl.executeUpdate(UpdateOperatorImpl.java:139)
    at io.questdb.griffin.engine.ops.UpdateOperation.apply(UpdateOperation.java:74)
    at io.questdb.cairo.TableWriter.apply(TableWriter.java:713)
    at io.questdb.cairo.wal.OperationExecutor.executeUpdate(OperationExecutor.java:98)
    at io.questdb.cairo.wal.ApplyWal2TableJob.processWalSql(ApplyWal2TableJob.java:564)
    at io.questdb.cairo.wal.ApplyWal2TableJob.processWalCommit(ApplyWal2TableJob.java:533)
    at io.questdb.cairo.wal.ApplyWal2TableJob.applyOutstandingWalTransactions(ApplyWal2TableJob.java:377)
    at io.questdb.cairo.wal.ApplyWal2TableJob.applyWal(ApplyWal2TableJob.java:658)
    at io.questdb.cairo.wal.ApplyWal2TableJob.doRun(ApplyWal2TableJob.java:707)
    at io.questdb.mp.AbstractQueueConsumerJob.run(AbstractQueueConsumerJob.java:50)
    at io.questdb.mp.Worker.run(Worker.java:152)

So I stopped WAL for the table and then resumed it. It seems to be working now, but I'm not sure if this is the right workaround or why this happened in the first place.

asked Mar 31 at 17:20 by Javier Ramirez

1 Answer

This type of error typically indicates data corruption, which very often happens after running out of disk space (leaving files in an inconsistent state) or after a hardware failure or power loss.

The first workaround is usually to run ALTER TABLE .. RESUME WAL, either with SQL or from the Web Console by clicking on the suspended table's info. If that works, it is the easiest fix. In most cases, skipping the faulty transactions gets your table running again.
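As a rough sketch of that first step, the statements could be built like this. The helper name `resume_wal_sql` is made up for illustration, and the assumption that resuming from the failed `seqTxn` plus one skips the faulty transaction should be checked against the `wal_tables()` output for your table before running anything.

```python
import re

# Query to see which WAL tables are suspended and where they stopped.
CHECK_SUSPENDED = "SELECT * FROM wal_tables();"


def resume_wal_sql(table, from_txn=None):
    """Build an ALTER TABLE ... RESUME WAL statement (hypothetical helper).

    With from_txn=None the statement retries from where the table stopped.
    With from_txn set, the table resumes from that sequencer transaction,
    so earlier transactions that were not yet applied are skipped.
    """
    # Keep the identifier simple because it is interpolated into SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    stmt = f"ALTER TABLE {table} RESUME WAL"
    if from_txn is not None:
        stmt += f" FROM TRANSACTION {int(from_txn)}"
    return stmt + ";"


# seqTxn reported in the "job failed, table suspended" log line above.
failed_txn = 30797315

retry_stmt = resume_wal_sql("solana_create_events")
# Assumption: resuming at failed_txn + 1 skips the transaction that failed.
skip_stmt = resume_wal_sql("solana_create_events", failed_txn + 1)

print(CHECK_SUSPENDED)
print(retry_stmt)
print(skip_stmt)

# To run them with psycopg 3 (connection values are QuestDB defaults,
# adjust for your setup):
#   import psycopg
#   conn = psycopg.connect(
#       "host=localhost port=8812 user=admin password=quest dbname=qdb",
#       autocommit=True)
#   conn.execute(retry_stmt)
```

Trying the plain retry first and only then the skip keeps data loss to a minimum, since skipping discards whatever the faulty transaction was meant to write.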

If that fails, a whole column file for a partition may be corrupted, and you could try dropping that partition altogether; in the case above, that is the partition for 2024-12-27.
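A minimal sketch of how to get from the error message to the drop statement, assuming the path layout seen in this log (table directory with a `~N` suffix, partition directory with a numeric suffix after the dot, column file named after the column). The helper names are hypothetical. Dropping the partition removes every row for that day, so it is only worth doing if that data can be re-ingested or is expendable.

```python
from pathlib import PurePosixPath


def locate_column_file(path):
    """Split a column file path from the error into table, partition, column.

    Assumes the layout <db root>/<table>~<id>/<partition>.<suffix>/<column>.<ext>...
    as it appears in the log above.
    """
    table_dir, partition_dir, column_file = PurePosixPath(path).parts[-3:]
    return {
        "table": table_dir.split("~", 1)[0],
        "partition": partition_dir.split(".", 1)[0],
        "column": column_file.split(".", 1)[0],
    }


def drop_partition_sql(table, partition):
    """Build an ALTER TABLE ... DROP PARTITION LIST statement."""
    return f"ALTER TABLE {table} DROP PARTITION LIST '{partition}';"


# Path copied from the "Invalid column size" error in the question.
bad_file = (
    "/var/lib/questdb/db/solana_create_events~9/"
    "2024-12-27.867943/ipfs_data.i.31430057"
)

info = locate_column_file(bad_file)
drop_stmt = drop_partition_sql(info["table"], info["partition"])

print(info)
print(drop_stmt)
```

The statement can be run from the Web Console or over the same psycopg connection used for ingestion.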

The workaround of disabling and re-enabling WAL is not ideal: it requires restarting the database, and it removes all pending WAL transactions, even though some of them might have been recoverable after skipping the problematic ones.

If everything else fails, the only solution is to recover from a backup.
