
database - QuestDB ApplyWal2TableJob error and invalid column size - Stack Overflow


I have a script that uses psycopg to ingest data into QuestDB, as shown in the docs. After a while, I see this error in the logs and the table is marked as suspended:

PGConnectionContext error [msg=`Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]`, errno=0]

This was in QuestDB 8.8.2.

The log also had extra information about a failure while applying the WAL:

2025-02-21T12:03:21.253673Z E i.q.c.w.ApplyWal2TableJob error applying SQL to wal table [table=solana_create_events, sql=UPDATE solana_create_events
                set ipfs_data = '{"name": "REDACTED", "symbol": "CRYPTOBALL", "description": "REDACTED ", "image": "redacted", "showName": true, "createdOn": "https://pump.fun"}', updated_at = '2025-02-21T11:31:36.522293'::timestamp
                where ts = '2024-12-27 00:00:10'
                and token = 'redacted'
                , error=Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0], errno=0]
2025-02-21T12:03:21.254316Z C i.q.c.w.ApplyWal2TableJob job failed, table suspended [table=solana_create_events~9, seqTxn=30797315, error=
io.questdb.cairo.CairoException: [0] Invalid column size [column=/var/lib/questdb/db/solana_create_events~9/2024-12-27.867943/ipfs_data.i.31430057, size=0]
    at io.questdb.cairo.CairoException.instance(CairoException.java:326)
    at io.questdb.cairo.CairoException.critical(CairoException.java:68)
    at io.questdb.cairo.TableReader.reloadColumnAt(TableReader.java:1319)
    at io.questdb.cairo.TableReader.openPartitionColumns(TableReader.java:1015)
    at io.questdb.cairo.TableReader.openPartition0(TableReader.java:982)
    at io.questdb.cairo.TableReader.openPartition(TableReader.java:459)
    at io.questdb.cairo.pool.ReaderPool$R.openPartition(ReaderPool.java:137)
    at io.questdb.cairo.IntervalFwdPartitionFrameCursor.next(IntervalFwdPartitionFrameCursor.java:56)
    at io.questdb.griffin.engine.table.FwdTableReaderPageFrameCursor.next(FwdTableReaderPageFrameCursor.java:120)
    at io.questdb.griffin.engine.table.PageFrameRecordCursorImpl.hasNext(PageFrameRecordCursorImpl.java:111)
    at io.questdb.griffin.engine.table.FilteredRecordCursor.hasNext(FilteredRecordCursor.java:66)
    at io.questdb.griffin.engine.AbstractVirtualFunctionRecordCursor.hasNext(AbstractVirtualFunctionRecordCursor.java:81)
    at io.questdb.griffin.UpdateOperatorImpl.executeUpdate(UpdateOperatorImpl.java:139)
    at io.questdb.griffin.engine.ops.UpdateOperation.apply(UpdateOperation.java:74)
    at io.questdb.cairo.TableWriter.apply(TableWriter.java:713)
    at io.questdb.cairo.wal.OperationExecutor.executeUpdate(OperationExecutor.java:98)
    at io.questdb.cairo.wal.ApplyWal2TableJob.processWalSql(ApplyWal2TableJob.java:564)
    at io.questdb.cairo.wal.ApplyWal2TableJob.processWalCommit(ApplyWal2TableJob.java:533)
    at io.questdb.cairo.wal.ApplyWal2TableJob.applyOutstandingWalTransactions(ApplyWal2TableJob.java:377)
    at io.questdb.cairo.wal.ApplyWal2TableJob.applyWal(ApplyWal2TableJob.java:658)
    at io.questdb.cairo.wal.ApplyWal2TableJob.doRun(ApplyWal2TableJob.java:707)
    at io.questdb.mp.AbstractQueueConsumerJob.run(AbstractQueueConsumerJob.java:50)
    at io.questdb.mp.Worker.run(Worker.java:152)

So I stopped WAL for the table and then resumed it. It seems to be working now, but I am not sure whether this is the right workaround or why this happened in the first place.

Asked Mar 31 at 17:20 by Javier Ramirez

1 Answer


This type of error typically indicates data corruption, which very often happens after running out of disk space (leaving files in an inconsistent state) or after a hardware failure or power loss.

The first workaround is typically to try ALTER TABLE .. RESUME WAL, which can be run either with SQL or graphically from the Web Console by clicking on the suspended table's info. If that works, it is the easiest way. In most cases, skipping the faulty transactions gets your table running again.
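As a rough illustration of the RESUME WAL step, here is a minimal Python sketch. The helper names (resume_wal_sql, resume_table) are hypothetical and not part of QuestDB or psycopg. It assumes a psycopg 3 connection opened with autocommit=True, and it assumes that skipping a failed transaction means resuming from the sequencer transaction right after the one reported as seqTxn in the log.

```python
import re

# Conservative check so that only plain identifiers end up in the statement.
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def resume_wal_sql(table, from_txn=None):
    """Build an ALTER TABLE ... RESUME WAL statement (hypothetical helper).

    With from_txn set, WAL apply restarts at that sequencer transaction,
    which skips every transaction before it that was not applied yet.
    """
    if not _TABLE_NAME.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    sql = f"ALTER TABLE {table} RESUME WAL"
    if from_txn is not None:
        sql += f" FROM TXN {int(from_txn)}"
    return sql


def resume_table(conn, table, failed_seq_txn=None):
    """Run RESUME WAL on an open connection and return the statement used.

    conn is assumed to be a psycopg 3 connection (anything exposing
    .execute(sql) works). If failed_seq_txn is given, the failing
    transaction is skipped by resuming from the next one (assumption).
    """
    from_txn = None if failed_seq_txn is None else failed_seq_txn + 1
    statement = resume_wal_sql(table, from_txn)
    conn.execute(statement)
    return statement
```

With the log above, resume_table(conn, "solana_create_events") would retry from where the table stopped, while passing failed_seq_txn=30797315 would skip the failing UPDATE. Skipped transactions are not applied, so their changes are lost. Running SELECT * FROM wal_tables() before and after shows whether the table is still suspended.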

If that fails, a whole column file for a partition might be corrupted, and you could try dropping that partition altogether. In the case above, that is the partition corresponding to 2024-12-27.
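To find which partition to drop, the path in the error message can be picked apart. The sketch below is only illustrative: the helper names are hypothetical, and it assumes the on-disk layout seen in the log (table directory with a ~N suffix, partition directory with a numeric suffix after the dot, then the column file). It then builds a DROP PARTITION LIST statement for that partition.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple


class ColumnLocation(NamedTuple):
    table: str
    partition: str
    column: str


def parse_invalid_column_error(message):
    """Extract table, partition and column from an 'Invalid column size' error.

    Assumes the path ends in <table>~N/<partition>.<suffix>/<column>.<ext>...,
    as in the log above; other layouts are not handled.
    """
    match = re.search(r"column=([^,\]]+)", message)
    if match is None:
        raise ValueError("no column path found in message")
    parts = PurePosixPath(match.group(1)).parts
    if len(parts) < 3:
        raise ValueError("column path is too short to contain a partition")
    table = parts[-3].split("~")[0]      # 'solana_create_events~9' -> table name
    partition = parts[-2].split(".")[0]  # '2024-12-27.867943' -> partition name
    column = parts[-1].split(".")[0]     # 'ipfs_data.i.31430057' -> column name
    return ColumnLocation(table, partition, column)


def drop_partition_sql(table, partition):
    """Build ALTER TABLE ... DROP PARTITION LIST for one partition."""
    if not re.match(r"^[A-Za-z_][A-Za-z0-9_]*$", table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not re.match(r"^\d{4}(-\d{2}){0,2}(T\d{2})?$", partition):
        raise ValueError(f"unexpected partition name: {partition!r}")
    return f"ALTER TABLE {table} DROP PARTITION LIST '{partition}'"
```

For the error in the question, this yields the table solana_create_events, the partition 2024-12-27 and the column ipfs_data. Dropping the partition deletes all rows in it, so it is worth exporting whatever is still readable first.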

The workaround of disabling and re-enabling WAL is not ideal: it requires restarting the database, and it removes all the pending WAL transactions, even though some of them might be recoverable after skipping the problematic ones.

If everything else fails, the only solution is to recover from a backup.
