I have a script using psycopg to ingest data into QuestDB, as seen in the docs. After a while I see this error in the logs, and the table is marked as suspended:
PGConnectionContext error [msg=`Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]`, errno=0]
This was in QuestDB 8.8.2.
I saw the log had extra information complaining about WAL log apply.
2025-02-21T12:03:21.253673Z E i.q.c.w.ApplyWal2TableJob error applying SQL to wal table [table=solana_create_events, sql=UPDATE solana_create_events
set ipfs_data = '{"name": "REDACTED", "symbol": "CRYPTOBALL", "description": "REDACTED ", "image": "redacted", "showName": true, "createdOn": "https://pump.fun"}', updated_at = '2025-02-21T11:31:36.522293'::timestamp
where ts = '2024-12-27 00:00:10'
and token = 'redacted'
, error=Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0], errno=0]
2025-02-21T12:03:21.254316Z C i.q.c.w.ApplyWal2TableJob job failed, table suspended [table=solana_create_events~9, seqTxn=30797315, error=
io.questdb.cairo.CairoException: [0] Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]
at io.questdb.cairo.CairoException.instance(CairoException.java:326)
at io.questdb.cairo.CairoException.critical(CairoException.java:68)
at io.questdb.cairo.TableReader.reloadColumnAt(TableReader.java:1319)
at io.questdb.cairo.TableReader.openPartitionColumns(TableReader.java:1015)
at io.questdb.cairo.TableReader.openPartition0(TableReader.java:982)
at io.questdb.cairo.TableReader.openPartition(TableReader.java:459)
at io.questdb.cairo.pool.ReaderPool$R.openPartition(ReaderPool.java:137)
at io.questdb.cairo.IntervalFwdPartitionFrameCursor.next(IntervalFwdPartitionFrameCursor.java:56)
at io.questdb.griffin.engine.table.FwdTableReaderPageFrameCursor.next(FwdTableReaderPageFrameCursor.java:120)
at io.questdb.griffin.engine.table.PageFrameRecordCursorImpl.hasNext(PageFrameRecordCursorImpl.java:111)
at io.questdb.griffin.engine.table.FilteredRecordCursor.hasNext(FilteredRecordCursor.java:66)
at io.questdb.griffin.engine.AbstractVirtualFunctionRecordCursor.hasNext(AbstractVirtualFunctionRecordCursor.java:81)
at io.questdb.griffin.UpdateOperatorImpl.executeUpdate(UpdateOperatorImpl.java:139)
at io.questdb.griffin.engine.ops.UpdateOperation.apply(UpdateOperation.java:74)
at io.questdb.cairo.TableWriter.apply(TableWriter.java:713)
at io.questdb.cairo.wal.OperationExecutor.executeUpdate(OperationExecutor.java:98)
at io.questdb.cairo.wal.ApplyWal2TableJob.processWalSql(ApplyWal2TableJob.java:564)
at io.questdb.cairo.wal.ApplyWal2TableJob.processWalCommit(ApplyWal2TableJob.java:533)
at io.questdb.cairo.wal.ApplyWal2TableJob.applyOutstandingWalTransactions(ApplyWal2TableJob.java:377)
at io.questdb.cairo.wal.ApplyWal2TableJob.applyWal(ApplyWal2TableJob.java:658)
at io.questdb.cairo.wal.ApplyWal2TableJob.doRun(ApplyWal2TableJob.java:707)
at io.questdb.mp.AbstractQueueConsumerJob.run(AbstractQueueConsumerJob.java:50)
at io.questdb.mp.Worker.run(Worker.java:152)
So I stopped WAL for the table and then resumed it. It seems to be working now, but I'm not sure if this is the right workaround or why this happened in the first place.
asked Mar 31 at 17:20 by Javier Ramirez
1 Answer
This type of error is typically seen when there is data corruption, which very often happens after running out of disk space (so files are left in an inconsistent state) or after a hardware failure or power loss.
The first workaround is typically to just try ALTER TABLE .. RESUME WAL, which can be done either with SQL or graphically from the Web Console by clicking on the suspended table info. If that works, that's the easiest way. In most cases, skipping the faulty transactions can get your table running again.
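Since the question mentions psycopg, here is a minimal sketch of issuing the resume from Python. It assumes psycopg 3 with a connection opened in autocommit mode; the helper names `resume_wal_sql` and `resume_and_check` are made up for illustration. It uses the optional `FROM TRANSACTION` clause of `RESUME WAL` and the `wal_tables()` function to check whether the table is still suspended.

```python
import re

# Conservative check so the table name can be interpolated into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def resume_wal_sql(table, from_txn=None):
    """Build an ALTER TABLE ... RESUME WAL statement.

    With from_txn=None the failed transaction is retried; with a number,
    WAL apply resumes from that sequencer transaction, skipping earlier ones.
    """
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    sql = f"ALTER TABLE {table} RESUME WAL"
    if from_txn is not None:
        sql += f" FROM TRANSACTION {int(from_txn)}"
    return sql


def resume_and_check(conn, table, from_txn=None):
    """Hypothetical helper: run the resume, then return the table's WAL status rows.

    `conn` is assumed to be an open psycopg (v3) connection in autocommit mode.
    """
    conn.execute(resume_wal_sql(table, from_txn))
    rows = conn.execute(
        "SELECT name, suspended, writerTxn, sequencerTxn FROM wal_tables()"
    ).fetchall()
    return [row for row in rows if row[0] == table]


if __name__ == "__main__":
    print(resume_wal_sql("solana_create_events"))
    print(resume_wal_sql("solana_create_events", from_txn=30797316))
```

In the log from the question the failing transaction is seqTxn=30797315, so skipping it would presumably mean resuming from 30797316. Keep in mind that whatever the skipped transaction was meant to change (here, an UPDATE) is not applied.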
If that fails, a whole column file for a partition might be corrupted, and you could try dropping that partition altogether; in the case above, that is the partition corresponding to 2024-12-27.
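Dropping the partition could look roughly like the sketch below. It assumes the table is partitioned by day (the directory name `2024-12-27.867943` in the error suggests so) and only builds the statement; `drop_partition_sql` is a hypothetical helper, and you would run the result through your usual psycopg connection or the Web Console. This deletes all rows in that partition, and the sketch does not verify whether the statement can be applied while the table is still suspended.

```python
import re
from datetime import date

# Conservative check so the table name can be interpolated into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def drop_partition_sql(table, day):
    """Build ALTER TABLE ... DROP PARTITION LIST for a single daily partition."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    # isoformat() yields YYYY-MM-DD, the name of a daily partition.
    return f"ALTER TABLE {table} DROP PARTITION LIST '{day.isoformat()}'"


if __name__ == "__main__":
    print(drop_partition_sql("solana_create_events", date(2024, 12, 27)))
```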
The workaround of disabling and re-enabling WAL is not ideal: it implies restarting the database, and it removes all the pending WAL transactions, even though some of them might be recoverable once the problematic ones are skipped.
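For reference, the disable/re-enable route corresponds, as far as I can tell, to the `ALTER TABLE ... SET TYPE` statements sketched below. `wal_toggle_sql` is a hypothetical helper that only builds the two statements; each conversion is assumed to take effect only after a database restart, which is why this is a last resort before going to a backup.

```python
import re

# Conservative check so the table name can be interpolated into the statements.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def wal_toggle_sql(table):
    """Return (disable, enable) statements for converting a table away from WAL and back.

    Assumption: a restart is needed after each statement for the change to apply.
    """
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    disable = f"ALTER TABLE {table} SET TYPE BYPASS WAL"
    enable = f"ALTER TABLE {table} SET TYPE WAL"
    return disable, enable


if __name__ == "__main__":
    for statement in wal_toggle_sql("solana_create_events"):
        print(statement)
```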
If everything else fails, the only solution would be recovering from a backup.