I am using Apache NiFi to consume Kafka messages and load them into CSV files. For intermediate processing, we have developed our own custom processors.
We want to ensure that logging follows a consistent format, whether using a custom processor or a built-in (OOB) processor. The desired log format is:
[%TIME%] [%LEVEL%] [flowfileuuid=?] %MESSAGE%
I am aware that we can use provenance reports to trace flowfiles, but we want to log the flowfile.uuid directly in our log messages.
I initially tried using MDC (Mapped Diagnostic Context), but since MDC is thread-scoped, it works only in custom processors. In custom processors, I can extract flowfile.uuid and set it in MDC like this:
String uuid = flowFile.getAttribute("uuid");
MDC.put("uuid", uuid); // key must match the %X{uuid} lookup in the logback pattern
And in logback.xml, I configured the log pattern as:
<pattern>[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] [request_id=%X{uuid}] %msg%n</pattern>
The Problem
The approach above works only for custom processors. It does not work for NiFi’s built-in processors (e.g., PutFile, UpdateAttribute): they never put the uuid into MDC, and because MDC is thread-scoped, a value set in one of our custom processors does not reliably carry over to them. As a result, the flowfile.uuid is missing from the log lines these processors write.
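To make the behaviour concrete, here is a minimal, self-contained sketch of what I believe is happening. It does not use NiFi or SLF4J: the class MdcScopeDemo and its put/get/remove/format helpers are hypothetical stand-ins for org.slf4j.MDC and for the logback pattern, backed by a plain ThreadLocal. The assumption is that a built-in processor behaves like the second thread here, which never sets the key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.slf4j.MDC; not NiFi or logback code.
class MdcScopeDemo {
    // One context map per thread, which is how MDC scoping works conceptually.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String key, String value) { CONTEXT.get().put(key, value); }
    public static String get(String key) { return CONTEXT.get().get(key); }
    public static void remove(String key) { CONTEXT.get().remove(key); }

    // Mimics a pattern such as "[flowfileuuid=%X{uuid}] %msg":
    // a missing key is rendered as an empty string.
    public static String format(String message) {
        String uuid = get("uuid");
        return "[flowfileuuid=" + (uuid == null ? "" : uuid) + "] " + message;
    }

    public static void main(String[] args) throws InterruptedException {
        put("uuid", "abc-123"); // what the custom processor does
        try {
            System.out.println(format("custom processor"));

            // Another thread that never calls put(), like a built-in processor.
            final String[] otherLine = new String[1];
            Thread other = new Thread(() -> otherLine[0] = format("built-in processor"));
            other.start();
            other.join();
            System.out.println(otherLine[0]);
        } finally {
            remove("uuid"); // avoid leaking the value into the next task on this thread
        }
    }
}
```

In this sketch the first line carries the uuid and the second has an empty flowfileuuid field, which matches what I see in the real logs for PutFile and UpdateAttribute.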
The Question
Is there a way to configure logback.xml to directly read the flowfile.uuid attribute, without relying on MDC? Or is there any other approach that would work for both custom processors and NiFi's built-in processors?
Any guidance or alternative solutions would be greatly appreciated!