Problem
I'm using Airflow 2.9.0 with OpenTelemetry metrics integration, but task instance state metrics like ti.start
aren't appearing in Prometheus, despite task instance creation metrics showing up correctly.
My Setup
I have the following components:
- Airflow 2.9.0 with Celery executor
- OpenTelemetry collector
- Prometheus for metrics storage
- Grafana for visualization
Docker Compose Configuration (relevant parts)

```yaml
services:
  webserver:
    environment:
      - AIRFLOW__METRICS__OTEL_ON=True
      - AIRFLOW__METRICS__STATSD_ON=False
      - AIRFLOW__METRICS__OTEL_HOST=otel-collector
      - AIRFLOW__METRICS__OTEL_PORT=4318
      - AIRFLOW__METRICS__OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
      - AIRFLOW__METRICS__OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
      - AIRFLOW__METRICS__OTEL_RESOURCE_ATTRIBUTES=service.name=airflow
      - AIRFLOW__METRICS__OTEL_METRICS_EXPORTER=otlp
      - AIRFLOW__METRICS__METRICS_USE_PATTERN_MATCH=True
      - AIRFLOW__METRICS__METRICS_OTEL_SSL_ACTIVE=False
      - AIRFLOW__METRICS__OTEL_DEBUGGING_ON=True
      - AIRFLOW__METRICS__METRICS_ALLOW_LIST=ti|task
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml", "--set=service.telemetry.logs.level=debug"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
```
OpenTelemetry Collector Config

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
  filter:
    metrics:
      include:
        match_type: regexp
        metric_names:
          - "task_instance.*"
          - "airflow.*task.*state.*"
          - "airflow.*task.*status.*"

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [filter, batch]
      exporters: [prometheus]
```
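One thing worth checking against this config: the filter processor's `include` list drops every metric whose name matches none of the three regexes. As a rough sanity check, the sketch below runs the same patterns against a few hypothetical metric names (the `airflow.ti.*` names are illustrative guesses, not Airflow's documented metric names, and `re.search` only approximates the collector's regexp filterset semantics, which may differ by version):

```python
import re

# Include patterns copied from the filter processor config above.
patterns = [r"task_instance.*", r"airflow.*task.*state.*", r"airflow.*task.*status.*"]

# Hypothetical metric names: the first resembles the creation metrics that
# do appear; the others approximate "ti" state metric names that do not.
candidates = [
    "airflow.task_instance_created_PythonOperator",
    "airflow.ti.start.mydag.mytask",
    "airflow.ti.finish.mydag.mytask.success",
]

for name in candidates:
    kept = any(re.search(p, name) for p in patterns)
    print(f"{name}: {'kept' if kept else 'dropped by filter'}")
```

If the real `ti` state metrics are named like the last two candidates, they would be dropped by this filter even when Airflow emits them correctly.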
What I've Found So Far
I can see task instance creation metrics in Prometheus:

```
airflow_task_instance_created_pythonoperator_total
airflow_task_instance_created_dbtoperator_total
airflow_task_instance_created_total
```
What I've Tried
- Enabled debugging in the OTel collector
- Checked the Airflow logs for any errors related to metrics
- Verified my METRICS_ALLOW_LIST includes ti|task
- Inspected the metrics data in the collector logs
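On top of the steps above, one can list exactly which metric names reach the collector by scraping its Prometheus exporter (endpoint 0.0.0.0:8889 in the config above, assuming that port is published to the host) and parsing the text exposition format. The helper below does that parsing; the sample input and its metric names are illustrative, not guaranteed Airflow output:

```python
def metric_names(exposition: str) -> set[str]:
    """Extract metric names from Prometheus text exposition format."""
    names = set()
    for line in exposition.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # Name ends at the label block "{" or at the first space.
        names.add(line.split("{")[0].split(" ")[0])
    return names

# Sample of what http://<collector-host>:8889/metrics might return.
sample = """\
# HELP airflow_task_instance_created_total ...
airflow_task_instance_created_total{operator="PythonOperator"} 12
airflow_dagrun_duration_success 3.5
"""
print(sorted(metric_names(sample)))
# → ['airflow_dagrun_duration_success', 'airflow_task_instance_created_total']
```

In practice one would fetch the real endpoint (e.g. with `urllib.request.urlopen("http://localhost:8889/metrics")`) and grep the resulting names for `ti` or `task`; if the state metrics are absent there too, they are being dropped before the Prometheus exporter, not by Prometheus scraping.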
Questions
- Why aren't task instance state metrics like ti.start and state transitions appearing in Prometheus despite other task instance metrics showing up?
- What might be filtering out these specific state-related metrics?
- Is there a specific configuration I'm missing to capture task instance state metrics?
- How can I verify that these metrics are actually being sent to the collector?
Any insights would be greatly appreciated!
asked Mar 20 at 9:26 by RSNboim

1 Answer
I have faced the same issue in Kubernetes. Solved it by setting explicitly:

```yaml
AIRFLOW__METRICS__STATSD_ON: "False"
```
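For a Docker Compose setup like the one in the question, the equivalent would be to pin the setting in the webserver (and any worker/scheduler) environment. A minimal sketch, assuming the service names from the question; the quotes matter in mapping-style YAML, where an unquoted `False` is parsed as a boolean rather than a string:

```yaml
services:
  webserver:
    environment:
      AIRFLOW__METRICS__OTEL_ON: "True"
      AIRFLOW__METRICS__STATSD_ON: "False"  # quoted so it stays a string
```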