java - Running jcmd JFR.dump seems to stall the JVM. How can I fix this? - Stack Overflow

Flight recording is enabled with disk=false when the service starts, and a timer service triggers jcmd <PID> JFR.dump every five minutes to write the recording to a file.

However, we notice performance issues that coincide with the execution of that command. This manifests in my Vert.x-based service as:

2025-02-05 17:00:39.998 [33] WARN  io.vertx.core.impl.BlockedThreadChecker - Thread Thread[#34,vert.x-eventloop-thread-0,5,main] has been blocked for 3154 ms, time limit is 2000 ms

The warning does not appear on every dump, but every time it does appear, its timestamp matches a JFR.dump run.

The periodic process doesn't do much. The service is running on an AWS EC2 instance. There are a couple of curl calls to get details about the instance, then the script creates the JFR file and copies it to S3. The main jcmd is simply:

jcmd $PID JFR.dump name=jfr filename=$JFR_FILE_LOCATION > /dev/null && log_message "JFR dump succeeded" || log_message "JFR dump failed"

I tried skipping the copy to S3, thinking that slow AWS API calls might somehow be stalling the main thread, but it didn't help. I also tried ZGC just in case, and that didn't help either. (Heap statistics are not enabled.) I didn't expect flight recordings to stall the JVM this significantly. Is JFR.dump known to have this effect? How can I fix it?

Asked Feb 7 at 14:16 by swpalmer

1 Answer

What stands out in your configuration is disk=false. Oracle's long-running stress testing of JFR has been done with disk=true, which is the default.

When you run with disk=false, stack trace data is not flushed out and can accumulate over time. That can lead to bugs that are hard to notice in short-lived unit tests or when running manually during development.
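
To try the default disk-backed mode described above, here is a minimal sketch using the standard jdk.jfr API. It is an illustration, not a confirmed fix for the stall. The class name DiskBackedRecording, the helper methods start and dump, the 10 minute max age, and the temp file name are all assumptions made for the example. The "default" configuration name refers to the settings file shipped with the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper class; the names are illustrative only.
public class DiskBackedRecording {

    // Starts a recording that writes its data to the disk repository
    // (the programmatic equivalent of disk=true).
    static Recording start(String name, Duration maxAge)
            throws IOException, ParseException {
        // "default" is the low-overhead settings file bundled with the JDK.
        Recording recording = new Recording(Configuration.getConfiguration("default"));
        recording.setName(name);
        recording.setToDisk(true);
        // Assumption: keep only recent data so the repository stays bounded.
        recording.setMaxAge(maxAge);
        recording.start();
        return recording;
    }

    // Writes the data recorded so far to target and returns the file size.
    // The recording keeps running afterwards.
    static long dump(Recording recording, Path target) throws IOException {
        recording.dump(target);
        return Files.size(target);
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("jfr-sketch", ".jfr");
        try (Recording recording = start("jfr", Duration.ofMinutes(10))) {
            Thread.sleep(1000); // let a little data accumulate
            long bytes = dump(recording, target);
            System.out.println("Dumped " + bytes + " bytes to " + target);
        }
    }
}
```

If the recording is started from the command line instead, the same mode can be requested by changing disk=false to disk=true in the existing recording options and leaving the periodic jcmd $PID JFR.dump call unchanged. Whether this removes the blocked event loop warnings would still need to be measured on the service itself.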
