最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

Go writing log file weirdly stuck in netpoll - Stack Overflow

programmeradmin4浏览0评论

Researching strange issue with weird delays in the app took me to the line where logging is called. At that line data received from communication channel are simply dumped to the log on trace level (they are text, or rather xml). Delay occurs when data to be dumped are of considerable size (unintentionally of course) - e.g. 100+ kilobytes. Delay is surprisingly of precisely 120-125 seconds and it seemingly do not depend on exact data size (e.g. 100kb or 300kb do not matter).

It uses zap logger under the hood and I found it simply stuck on underlying (*os.File) Write method.

Killing the app with kill -3 <pid> we collect stacktrace containing the following (I believe it is go v1.22.5 to match line numbers):

goroutine 660 gp=0xc000603180 m=nil [IO wait]:
runtime.gopark(0xc0002be1a0?, 0x10101139100?, 0x80?, 0x92?, 0xb?)
    /usr/lib/go/src/runtime/proc.go:402 +0xce fp=0xc0006515e0 sp=0xc0006515c0 pc=0x44358e
runtimepollblock(0x48cf3b?, 0x40b146?, 0x0?)
    /usr/lib/go/src/runtime/netpoll.go:573 +0xf7 fp=0xc000651618 sp=0xc0006515e0 pc=0x43c1d7
internal/poll.runtime_pollWait(0x7bca99ea4d48, 0x77)
    /usr/lib/go/src/runtime/netpoll.go:345 +0x85 fp=0xc000651638 sp=0xc000651618 pc=0x4714e5
internal/poll.(*pollDesc).wait(0xc000059320?, 0xc000dde000?, 0x1)
    /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000651660 sp=0xc000651638 pc=0x4ef447
internal/poll.(*pollDesc).waitWrite(...)
    /usr/lib/go/src/internal/poll/fd_poll_runtime.go:93
internal/poll.(*FD).Write(0xc000059320, {0xc000dbe000, 0x351d7, 0x36000})
    /usr/lib/go/src/internal/poll/fd_unix.go:388 +0x2d9 fp=0xc000651710 sp=0xc000651660 pc=0x4f27d9
os.(*File).write(...)
    /usr/lib/go/src/os/file_posix.go:46
os.(*File).Write(0xc00078e210, {0xc000dbe000?, 0x351d7, 0xc0003c6a40?})
    /usr/lib/go/src/os/file.go:189 +0x51 fp=0xc000651770 sp=0xc000651710 pc=0x4fc031
go.uber/zap/zapcore.(*lockedWriteSyncer).Write(0xc0000ea0a8, {0xc000dbe000?, 0x2b06a37fd?, 0x200a460?})
    /home/rodion/my/oam-agent/out/.cache/go-pkg/go.uber/[email protected]/zapcore/write_syncer.go:66 +0x6c fp=0xc0006517b8 sp=0xc000651770 pc=0xe1956c
go.uber/zap/zapcore.(*ioCore).Write(0xc00048f260, {0xfc, {0xc1e40920854e5dcd, 0x2b06a37fd, 0x200a460}, {0x0, 0x0}, {0xc000d88000, 0x351b2}, {0x0, ...}, ...}, ...)
    /home/rodion/my/oam-agent/out/.cache/go-pkg/go.uber/[email protected]/zapcore/core.go:99 +0xb5 fp=0xc000651888 sp=0xc0006517b8 pc=0xe0c975
go.uber/zap/zapcore.(*CheckedEntry).Write(0xc0009a00d0, {0x0, 0x0, 0x0})
    /home/rodion/my/oam-agent/out/.cache/go-pkg/go.uber/[email protected]/zapcore/entry.go:253 +0x11c fp=0xc000651a18 sp=0xc000651888 pc=0xe0ebdc

I don't see or understand the reason this netpoll is waiting in this situation and how this could be related to the data size passed to Write. Thus any explanations, hints and advice on further debugging is welcome. Not sure if it matters, but the app is executed in docker (64 bit amd linux).

It is some custom built linux I believe so there is a small chance something is unusual or buggy about it - in which case it would be good to think of some way to figure this out too.

发布评论

评论列表(0)

  1. 暂无评论