Add flag evaluation metrics via OTel counter and OpenFeature Hook by typotter · Pull Request #11040 · DataDog/dd-trace-java

typotter · 2026-04-02T15:34:34Z

What Does This Do

Records a feature_flag.evaluations OTel counter metric on every flag evaluation via an OpenFeature finallyAfter hook. The hook captures all evaluation paths including type mismatches that occur above the provider level in the OpenFeature SDK pipeline.

Creates a dedicated SdkMeterProvider with an OtlpHttpMetricExporter that sends metrics directly to the DD Agent's OTLP endpoint (/v1/metrics). This avoids the agent's OTel class shading (io.opentelemetry.api.* → datadog.trace.bootstrap.otel.api.*) which prevents using GlobalOpenTelemetry from the published dd-openfeature jar.

Metric attributes:

Attribute	When present	Value
`feature_flag.key`	Always	Flag key
`feature_flag.result.variant`	Always	Variant key (empty string if null)
`feature_flag.result.reason`	Always	Reason lowercased
`error.type`	On error	ErrorCode lowercased
`feature_flag.result.allocation_key`	When present	Allocation key from flag metadata
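
A minimal sketch of how the attribute rules in the table above could be applied. This is not the PR's code: `FlagEvalAttributes` and `build` are hypothetical names, plain strings stand in for the OpenFeature types, and a non-null reason is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper illustrating the attribute table; not taken from FlagEvalHook.
final class FlagEvalAttributes {
  private FlagEvalAttributes() {}

  // Assumes flagKey and reason are non-null; variant, errorCode, allocationKey may be null.
  static Map<String, String> build(
      String flagKey, String variant, String reason, String errorCode, String allocationKey) {
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put("feature_flag.key", flagKey);
    // Always present: a null variant becomes the empty string.
    attrs.put("feature_flag.result.variant", variant == null ? "" : variant);
    // Always present: the reason is lowercased.
    attrs.put("feature_flag.result.reason", reason.toLowerCase(Locale.ROOT));
    // Only on error: the error code is lowercased.
    if (errorCode != null) {
      attrs.put("error.type", errorCode.toLowerCase(Locale.ROOT));
    }
    // Only when the flag metadata carries an allocation key.
    if (allocationKey != null) {
      attrs.put("feature_flag.result.allocation_key", allocationKey);
    }
    return attrs;
  }

  public static void main(String[] args) {
    System.out.println(build("checkout.v2", "on", "TARGETING_MATCH", null, "alloc-1"));
    System.out.println(build("checkout.v2", null, "ERROR", "TYPE_MISMATCH", null));
  }
}
```

Using `Locale.ROOT` keeps the lowercasing independent of the JVM's default locale, which matters for attribute values that are compared across tracers.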

New files: FlagEvalMetrics.java, FlagEvalHook.java, FlagEvalMetricsTest.java, FlagEvalHookTest.java
Modified files: Provider.java (adds getProviderHooks()), ProviderTest.java, build.gradle.kts

Motivation

Evaluation metrics allow tracking how many times flags are evaluated, with which results, across sessions. This is the Java implementation of the evaluation logging spec (FFL-1942), matching the existing Python (dd-trace-py#17029) and Go (dd-trace-go#4489) implementations.

System tests: 11/17 pass. The 6 remaining failures are pre-existing DDEvaluator gaps (reason mapping, parse error codes) addressed in separate PRs (#11036, #10971).

References:

Python implementation: feat(openfeature): add flag evaluation metrics dd-trace-py#17029
Go implementation: feat(openfeature): add flag evaluation tracking via OTel Metrics dd-trace-go#4489
Go cross-tracer consistency: fix(openfeature): improve FFE eval metrics cross-tracer consistency dd-trace-go#4590
System tests: DataDog/system-tests (branch sameerank/FFL-1942/add-flag-eval-metrics)

Additional Notes

OTel SDK dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp) are compileOnly — applications must include them on the classpath for metrics to flow. Falls back to silent no-op when absent.
Export interval: 10s (matching Go SDK and EVALLOG.4 spec)
Endpoint resolution follows OTel spec: OTEL_EXPORTER_OTLP_METRICS_ENDPOINT → OTEL_EXPORTER_OTLP_ENDPOINT + /v1/metrics → http://localhost:4318/v1/metrics
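
The endpoint fallback chain in the last note can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `OtlpMetricsEndpoint` and `resolve` are hypothetical names, the environment is passed in as a map so the example stays self-contained, and trimming a trailing slash before appending `/v1/metrics` is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver for the OTLP metrics endpoint; names are illustrative only.
final class OtlpMetricsEndpoint {
  static final String DEFAULT_ENDPOINT = "http://localhost:4318/v1/metrics";

  private OtlpMetricsEndpoint() {}

  static String resolve(Map<String, String> env) {
    // 1. The signal-specific variable wins and is used as-is.
    String specific = env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT");
    if (specific != null && !specific.isEmpty()) {
      return specific;
    }
    // 2. The generic variable is a base URL; the metrics path is appended.
    String generic = env.get("OTEL_EXPORTER_OTLP_ENDPOINT");
    if (generic != null && !generic.isEmpty()) {
      // Assumption: drop one trailing slash to avoid "//v1/metrics".
      String base = generic.endsWith("/") ? generic.substring(0, generic.length() - 1) : generic;
      return base + "/v1/metrics";
    }
    // 3. Neither is set: fall back to the local default.
    return DEFAULT_ENDPOINT;
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    System.out.println(resolve(env));
    env.put("OTEL_EXPORTER_OTLP_ENDPOINT", "http://agent:4318");
    System.out.println(resolve(env));
    env.put("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT", "http://collector:9999/custom");
    System.out.println(resolve(env));
  }
}
```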

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels
Avoid using close, fix, or any linking keywords when referencing an issue

Jira ticket: FFL-1942

pr-commenter · 2026-04-08T20:15:44Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	typo/evaluations-logging
git_commit_date	1776183543	1776183132
git_commit_sha	f89a0b26cc	`93af7a8`
release_version	1.62.0-SNAPSHOT~9f89a0b26cc	1.62.0-SNAPSHOT~93af7a8bd4

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1776184837	1776184837
ci_job_id	1594357445	1594357445
ci_pipeline_id	107638013	107638013
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-tm8gb8ws 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-tm8gb8ws 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.055 s) : 0, 1054701
Total [baseline] (8.838 s) : 0, 8837845
Agent [candidate] (1.056 s) : 0, 1056172
Total [candidate] (8.845 s) : 0, 8844645
section iast
Agent [baseline] (1.225 s) : 0, 1225204
Total [baseline] (9.579 s) : 0, 9579185
Agent [candidate] (1.225 s) : 0, 1224723
Total [candidate] (9.579 s) : 0, 9578576

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.055 s	-
Agent	iast	1.225 s	170.503 ms (16.2%)
Total	tracing	8.838 s	-
Total	iast	9.579 s	741.341 ms (8.4%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	iast	1.225 s	168.551 ms (16.0%)
Total	tracing	8.845 s	-
Total	iast	9.579 s	733.931 ms (8.3%)

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.209 ms) : 0, 1209
BytebuddyAgent [baseline] (632.642 ms) : 0, 632642
BytebuddyAgent [candidate] (632.853 ms) : 0, 632853
AgentMeter [baseline] (29.478 ms) : 0, 29478
AgentMeter [candidate] (29.328 ms) : 0, 29328
GlobalTracer [baseline] (248.626 ms) : 0, 248626
GlobalTracer [candidate] (248.492 ms) : 0, 248492
AppSec [baseline] (32.06 ms) : 0, 32060
AppSec [candidate] (32.1 ms) : 0, 32100
Debugger [baseline] (59.155 ms) : 0, 59155
Debugger [candidate] (59.319 ms) : 0, 59319
Remote Config [baseline] (597.5 µs) : 0, 598
Remote Config [candidate] (592.601 µs) : 0, 593
Telemetry [baseline] (8.069 ms) : 0, 8069
Telemetry [candidate] (8.096 ms) : 0, 8096
Flare Poller [baseline] (6.769 ms) : 0, 6769
Flare Poller [candidate] (8.14 ms) : 0, 8140
section iast
crashtracking [baseline] (1.227 ms) : 0, 1227
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (801.631 ms) : 0, 801631
BytebuddyAgent [candidate] (802.325 ms) : 0, 802325
AgentMeter [baseline] (11.402 ms) : 0, 11402
AgentMeter [candidate] (11.419 ms) : 0, 11419
GlobalTracer [baseline] (239.371 ms) : 0, 239371
GlobalTracer [candidate] (239.338 ms) : 0, 239338
IAST [baseline] (25.892 ms) : 0, 25892
IAST [candidate] (25.864 ms) : 0, 25864
AppSec [baseline] (28.785 ms) : 0, 28785
AppSec [candidate] (30.229 ms) : 0, 30229
Debugger [baseline] (63.178 ms) : 0, 63178
Debugger [candidate] (62.446 ms) : 0, 62446
Remote Config [baseline] (1.164 ms) : 0, 1164
Remote Config [candidate] (536.498 µs) : 0, 536
Telemetry [baseline] (12.846 ms) : 0, 12846
Telemetry [candidate] (11.477 ms) : 0, 11477
Flare Poller [baseline] (3.415 ms) : 0, 3415
Flare Poller [candidate] (3.558 ms) : 0, 3558

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1056136
Total [baseline] (11.171 s) : 0, 11170559
Agent [candidate] (1.055 s) : 0, 1055112
Total [candidate] (11.069 s) : 0, 11068635
section appsec
Agent [baseline] (1.249 s) : 0, 1249109
Total [baseline] (11.125 s) : 0, 11125396
Agent [candidate] (1.247 s) : 0, 1246854
Total [candidate] (11.064 s) : 0, 11063540
section iast
Agent [baseline] (1.243 s) : 0, 1242623
Total [baseline] (11.297 s) : 0, 11296861
Agent [candidate] (1.23 s) : 0, 1229872
Total [candidate] (11.267 s) : 0, 11267069
section profiling
Agent [baseline] (1.187 s) : 0, 1187372
Total [baseline] (11.169 s) : 0, 11168511
Agent [candidate] (1.182 s) : 0, 1182329
Total [candidate] (11.051 s) : 0, 11050685

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	appsec	1.249 s	192.973 ms (18.3%)
Agent	iast	1.243 s	186.487 ms (17.7%)
Agent	profiling	1.187 s	131.236 ms (12.4%)
Total	tracing	11.171 s	-
Total	appsec	11.125 s	-45.163 ms (-0.4%)
Total	iast	11.297 s	126.302 ms (1.1%)
Total	profiling	11.169 s	-2.048 ms (-0.0%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.055 s	-
Agent	appsec	1.247 s	191.742 ms (18.2%)
Agent	iast	1.23 s	174.759 ms (16.6%)
Agent	profiling	1.182 s	127.217 ms (12.1%)
Total	tracing	11.069 s	-
Total	appsec	11.064 s	-5.094 ms (-0.0%)
Total	iast	11.267 s	198.434 ms (1.8%)
Total	profiling	11.051 s	-17.949 ms (-0.2%)

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.217 ms) : 0, 1217
BytebuddyAgent [baseline] (632.767 ms) : 0, 632767
BytebuddyAgent [candidate] (631.1 ms) : 0, 631100
AgentMeter [baseline] (29.43 ms) : 0, 29430
AgentMeter [candidate] (29.416 ms) : 0, 29416
GlobalTracer [baseline] (249.195 ms) : 0, 249195
GlobalTracer [candidate] (248.948 ms) : 0, 248948
AppSec [baseline] (31.894 ms) : 0, 31894
AppSec [candidate] (32.057 ms) : 0, 32057
Debugger [baseline] (60.094 ms) : 0, 60094
Debugger [candidate] (60.149 ms) : 0, 60149
Remote Config [baseline] (596.779 µs) : 0, 597
Remote Config [candidate] (609.218 µs) : 0, 609
Telemetry [baseline] (8.1 ms) : 0, 8100
Telemetry [candidate] (8.163 ms) : 0, 8163
Flare Poller [baseline] (6.717 ms) : 0, 6717
Flare Poller [candidate] (7.34 ms) : 0, 7340
section appsec
crashtracking [baseline] (1.226 ms) : 0, 1226
crashtracking [candidate] (1.214 ms) : 0, 1214
BytebuddyAgent [baseline] (661.92 ms) : 0, 661920
BytebuddyAgent [candidate] (661.223 ms) : 0, 661223
AgentMeter [baseline] (12.056 ms) : 0, 12056
AgentMeter [candidate] (12.078 ms) : 0, 12078
GlobalTracer [baseline] (249.511 ms) : 0, 249511
GlobalTracer [candidate] (248.932 ms) : 0, 248932
IAST [baseline] (24.618 ms) : 0, 24618
IAST [candidate] (24.574 ms) : 0, 24574
AppSec [baseline] (184.913 ms) : 0, 184913
AppSec [candidate] (183.915 ms) : 0, 183915
Debugger [baseline] (65.792 ms) : 0, 65792
Debugger [candidate] (65.715 ms) : 0, 65715
Remote Config [baseline] (629.984 µs) : 0, 630
Remote Config [candidate] (597.85 µs) : 0, 598
Telemetry [baseline] (8.582 ms) : 0, 8582
Telemetry [candidate] (8.69 ms) : 0, 8690
Flare Poller [baseline] (3.559 ms) : 0, 3559
Flare Poller [candidate] (3.566 ms) : 0, 3566
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.235 ms) : 0, 1235
BytebuddyAgent [baseline] (814.562 ms) : 0, 814562
BytebuddyAgent [candidate] (805.386 ms) : 0, 805386
AgentMeter [baseline] (11.66 ms) : 0, 11660
AgentMeter [candidate] (11.449 ms) : 0, 11449
GlobalTracer [baseline] (242.232 ms) : 0, 242232
GlobalTracer [candidate] (240.501 ms) : 0, 240501
IAST [baseline] (26.233 ms) : 0, 26233
IAST [candidate] (26.718 ms) : 0, 26718
AppSec [baseline] (31.186 ms) : 0, 31186
AppSec [candidate] (30.881 ms) : 0, 30881
Debugger [baseline] (60.677 ms) : 0, 60677
Debugger [candidate] (61.309 ms) : 0, 61309
Remote Config [baseline] (525.706 µs) : 0, 526
Remote Config [candidate] (525.674 µs) : 0, 526
Telemetry [baseline] (13.537 ms) : 0, 13537
Telemetry [candidate] (11.911 ms) : 0, 11911
Flare Poller [baseline] (3.55 ms) : 0, 3550
Flare Poller [candidate] (3.514 ms) : 0, 3514
section profiling
crashtracking [baseline] (1.187 ms) : 0, 1187
crashtracking [candidate] (1.188 ms) : 0, 1188
BytebuddyAgent [baseline] (693.378 ms) : 0, 693378
BytebuddyAgent [candidate] (690.17 ms) : 0, 690170
AgentMeter [baseline] (9.128 ms) : 0, 9128
AgentMeter [candidate] (9.065 ms) : 0, 9065
GlobalTracer [baseline] (207.772 ms) : 0, 207772
GlobalTracer [candidate] (206.849 ms) : 0, 206849
AppSec [baseline] (32.631 ms) : 0, 32631
AppSec [candidate] (32.53 ms) : 0, 32530
Debugger [baseline] (65.809 ms) : 0, 65809
Debugger [candidate] (65.561 ms) : 0, 65561
Remote Config [baseline] (569.285 µs) : 0, 569
Remote Config [candidate] (579.628 µs) : 0, 580
Telemetry [baseline] (7.871 ms) : 0, 7871
Telemetry [candidate] (7.846 ms) : 0, 7846
Flare Poller [baseline] (3.572 ms) : 0, 3572
Flare Poller [candidate] (3.594 ms) : 0, 3594
ProfilingAgent [baseline] (94.043 ms) : 0, 94043
ProfilingAgent [candidate] (93.834 ms) : 0, 93834
Profiling [baseline] (94.604 ms) : 0, 94604
Profiling [candidate] (94.39 ms) : 0, 94390

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	typo/evaluations-logging
git_commit_date	1776183543	1776183132
git_commit_sha	f89a0b26cc	`93af7a8`
release_version	1.62.0-SNAPSHOT~9f89a0b26cc	1.62.0-SNAPSHOT~93af7a8bd4

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1776185317	1776185317
ci_job_id	1594357448	1594357448
ci_pipeline_id	107638013	107638013
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-6mf245k6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-6mf245k6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 3 performance improvements and 1 performance regressions! Performance is the same for 17 metrics, 15 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	worse [+52.116µs; +125.929µs] or [+2.001%; +4.836%]	unsure [+26.944µs; +388.657µs] or [+0.354%; +5.108%]	unstable [-183.362op/s; +89.425op/s] or [-13.393%; +6.532%]	2.693ms	7.817ms	1322.156op/s	2.604ms	7.609ms	1369.125op/s
scenario:load:petclinic:profiling:high_load	better [-973.056µs; -488.906µs] or [-5.230%; -2.628%]	unsure [-1539.062µs; -170.761µs] or [-5.128%; -0.569%]	unstable [-16.683op/s; +35.433op/s] or [-6.757%; +14.351%]	17.874ms	29.160ms	256.281op/s	18.605ms	30.014ms	246.906op/s
scenario:load:petclinic:no_agent:high_load	better [-2.950ms; -1.817ms] or [-15.737%; -9.693%]	better [-5.137ms; -2.407ms] or [-16.339%; -7.656%]	unstable [+3.383op/s; +60.304op/s] or [+1.390%; +24.769%]	16.362ms	27.666ms	275.312op/s	18.745ms	31.438ms	243.469op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.173 ms) : 18977, 19368
.   : milestone, 19173,
appsec (18.798 ms) : 18609, 18987
.   : milestone, 18798,
code_origins (17.942 ms) : 17768, 18116
.   : milestone, 17942,
iast (17.878 ms) : 17699, 18057
.   : milestone, 17878,
profiling (18.902 ms) : 18712, 19091
.   : milestone, 18902,
tracing (17.913 ms) : 17733, 18093
.   : milestone, 17913,
section candidate
no_agent (16.942 ms) : 16778, 17106
.   : milestone, 16942,
appsec (18.954 ms) : 18763, 19144
.   : milestone, 18954,
code_origins (18.036 ms) : 17859, 18213
.   : milestone, 18036,
iast (17.909 ms) : 17734, 18085
.   : milestone, 17909,
profiling (18.207 ms) : 18029, 18385
.   : milestone, 18207,
tracing (18.279 ms) : 18098, 18460
.   : milestone, 18279,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.173 ms [18.977 ms, 19.368 ms]	-
appsec	18.798 ms [18.609 ms, 18.987 ms]	-374.583 µs (-2.0%)
code_origins	17.942 ms [17.768 ms, 18.116 ms]	-1.23 ms (-6.4%)
iast	17.878 ms [17.699 ms, 18.057 ms]	-1.294 ms (-6.8%)
profiling	18.902 ms [18.712 ms, 19.091 ms]	-271.316 µs (-1.4%)
tracing	17.913 ms [17.733 ms, 18.093 ms]	-1.26 ms (-6.6%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	16.942 ms [16.778 ms, 17.106 ms]	-
appsec	18.954 ms [18.763 ms, 19.144 ms]	2.012 ms (11.9%)
code_origins	18.036 ms [17.859 ms, 18.213 ms]	1.094 ms (6.5%)
iast	17.909 ms [17.734 ms, 18.085 ms]	967.383 µs (5.7%)
profiling	18.207 ms [18.029 ms, 18.385 ms]	1.265 ms (7.5%)
tracing	18.279 ms [18.098 ms, 18.46 ms]	1.337 ms (7.9%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.252 ms) : 1239, 1264
.   : milestone, 1252,
iast (3.344 ms) : 3294, 3393
.   : milestone, 3344,
iast_FULL (6.204 ms) : 6139, 6269
.   : milestone, 6204,
iast_GLOBAL (3.638 ms) : 3584, 3692
.   : milestone, 3638,
profiling (2.439 ms) : 2415, 2462
.   : milestone, 2439,
tracing (1.922 ms) : 1906, 1938
.   : milestone, 1922,
section candidate
no_agent (1.247 ms) : 1235, 1260
.   : milestone, 1247,
iast (3.465 ms) : 3414, 3515
.   : milestone, 3465,
iast_FULL (6.197 ms) : 6133, 6260
.   : milestone, 6197,
iast_GLOBAL (3.612 ms) : 3561, 3663
.   : milestone, 3612,
profiling (2.315 ms) : 2292, 2338
.   : milestone, 2315,
tracing (1.928 ms) : 1912, 1944
.   : milestone, 1928,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.252 ms [1.239 ms, 1.264 ms]	-
iast	3.344 ms [3.294 ms, 3.393 ms]	2.092 ms (167.2%)
iast_FULL	6.204 ms [6.139 ms, 6.269 ms]	4.952 ms (395.7%)
iast_GLOBAL	3.638 ms [3.584 ms, 3.692 ms]	2.386 ms (190.6%)
profiling	2.439 ms [2.415 ms, 2.462 ms]	1.187 ms (94.8%)
tracing	1.922 ms [1.906 ms, 1.938 ms]	670.48 µs (53.6%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.247 ms [1.235 ms, 1.26 ms]	-
iast	3.465 ms [3.414 ms, 3.515 ms]	2.217 ms (177.8%)
iast_FULL	6.197 ms [6.133 ms, 6.26 ms]	4.95 ms (396.9%)
iast_GLOBAL	3.612 ms [3.561 ms, 3.663 ms]	2.365 ms (189.6%)
profiling	2.315 ms [2.292 ms, 2.338 ms]	1.068 ms (85.6%)
tracing	1.928 ms [1.912 ms, 1.944 ms]	681.116 µs (54.6%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	typo/evaluations-logging
git_commit_date	1776183642	1776183132
git_commit_sha	f89a0b26cc	`93af7a8`
release_version	1.62.0-SNAPSHOT~9f89a0b26cc	1.62.0-SNAPSHOT~93af7a8bd4

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1776185152	1776185152
ci_job_id	1594357450	1594357450
ci_pipeline_id	107638013	107638013
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-49kqbmud 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-49kqbmud 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 0 unstable metrics.

scenario	Δ mean execution_time	candidate mean execution_time	baseline mean execution_time
scenario:dacapo:tomcat:appsec	better [-1.427ms; -1.082ms] or [-37.408%; -28.371%]	2.559ms	3.813ms

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.5 ms) : 1488, 1511
.   : milestone, 1500,
appsec (3.813 ms) : 3594, 4033
.   : milestone, 3813,
iast (2.294 ms) : 2224, 2363
.   : milestone, 2294,
iast_GLOBAL (2.344 ms) : 2274, 2414
.   : milestone, 2344,
profiling (2.12 ms) : 2064, 2175
.   : milestone, 2120,
tracing (2.097 ms) : 2043, 2151
.   : milestone, 2097,
section candidate
no_agent (1.502 ms) : 1490, 1514
.   : milestone, 1502,
appsec (2.559 ms) : 2504, 2614
.   : milestone, 2559,
iast (2.282 ms) : 2213, 2351
.   : milestone, 2282,
iast_GLOBAL (2.352 ms) : 2282, 2422
.   : milestone, 2352,
profiling (2.112 ms) : 2057, 2167
.   : milestone, 2112,
tracing (2.098 ms) : 2045, 2152
.   : milestone, 2098,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.5 ms [1.488 ms, 1.511 ms]	-
appsec	3.813 ms [3.594 ms, 4.033 ms]	2.314 ms (154.3%)
iast	2.294 ms [2.224 ms, 2.363 ms]	793.94 µs (52.9%)
iast_GLOBAL	2.344 ms [2.274 ms, 2.414 ms]	844.332 µs (56.3%)
profiling	2.12 ms [2.064 ms, 2.175 ms]	619.957 µs (41.3%)
tracing	2.097 ms [2.043 ms, 2.151 ms]	597.476 µs (39.8%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.502 ms [1.49 ms, 1.514 ms]	-
appsec	2.559 ms [2.504 ms, 2.614 ms]	1.057 ms (70.4%)
iast	2.282 ms [2.213 ms, 2.351 ms]	779.955 µs (51.9%)
iast_GLOBAL	2.352 ms [2.282 ms, 2.422 ms]	850.068 µs (56.6%)
profiling	2.112 ms [2.057 ms, 2.167 ms]	609.942 µs (40.6%)
tracing	2.098 ms [2.045 ms, 2.152 ms]	596.529 µs (39.7%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.681 s) : 15681000, 15681000
.   : milestone, 15681000,
appsec (14.64 s) : 14640000, 14640000
.   : milestone, 14640000,
iast (18.443 s) : 18443000, 18443000
.   : milestone, 18443000,
iast_GLOBAL (18.058 s) : 18058000, 18058000
.   : milestone, 18058000,
profiling (15.445 s) : 15445000, 15445000
.   : milestone, 15445000,
tracing (14.892 s) : 14892000, 14892000
.   : milestone, 14892000,
section candidate
no_agent (14.941 s) : 14941000, 14941000
.   : milestone, 14941000,
appsec (14.948 s) : 14948000, 14948000
.   : milestone, 14948000,
iast (18.341 s) : 18341000, 18341000
.   : milestone, 18341000,
iast_GLOBAL (17.978 s) : 17978000, 17978000
.   : milestone, 17978000,
profiling (15.038 s) : 15038000, 15038000
.   : milestone, 15038000,
tracing (14.942 s) : 14942000, 14942000
.   : milestone, 14942000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.681 s [15.681 s, 15.681 s]	-
appsec	14.64 s [14.64 s, 14.64 s]	-1.041 s (-6.6%)
iast	18.443 s [18.443 s, 18.443 s]	2.762 s (17.6%)
iast_GLOBAL	18.058 s [18.058 s, 18.058 s]	2.377 s (15.2%)
profiling	15.445 s [15.445 s, 15.445 s]	-236.0 ms (-1.5%)
tracing	14.892 s [14.892 s, 14.892 s]	-789.0 ms (-5.0%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.941 s [14.941 s, 14.941 s]	-
appsec	14.948 s [14.948 s, 14.948 s]	7.0 ms (0.0%)
iast	18.341 s [18.341 s, 18.341 s]	3.4 s (22.8%)
iast_GLOBAL	17.978 s [17.978 s, 17.978 s]	3.037 s (20.3%)
profiling	15.038 s [15.038 s, 15.038 s]	97.0 ms (0.6%)
tracing	14.942 s [14.942 s, 14.942 s]	1.0 ms (0.0%)

Record a `feature_flag.evaluations` OTel counter on every flag evaluation using an OpenFeature `finallyAfter` hook. The hook captures all evaluation paths including type mismatches that occur above the provider level. Attributes: feature_flag.key, feature_flag.result.variant, feature_flag.result.reason, error.type (on error), feature_flag.result.allocation_key (when present). Counter is a no-op when DD_METRICS_OTEL_ENABLED is false or opentelemetry-api is absent from the classpath.

Replace GlobalOpenTelemetry.getMeterProvider() with a dedicated SdkMeterProvider + OtlpHttpMetricExporter that sends metrics directly to the DD Agent's OTLP endpoint (default :4318/v1/metrics). This avoids the agent's OTel class shading issue where the agent relocates io.opentelemetry.api.* to datadog.trace.bootstrap.otel.api.*, making GlobalOpenTelemetry calls from the dd-openfeature jar hit the unshaded no-op provider instead of the agent's shim. Requires opentelemetry-sdk-metrics and opentelemetry-exporter-otlp on the application classpath. Falls back to no-op if absent. System tests: 11/17 pass. 6 failures are pre-existing DDEvaluator gaps (reason mapping, parse errors, type mismatch strictness).

- Add explicit null guard for details in FlagEvalHook.finallyAfter()
- Add OTEL_EXPORTER_OTLP_ENDPOINT generic env var fallback with /v1/metrics path appended (per OTel spec fallback chain)
- Add comments clarifying signal-specific vs generic endpoint behavior

When the OTel SDK jars are not on the application classpath, loading FlagEvalMetrics fails because field types reference OTel SDK classes (SdkMeterProvider). This propagated as an uncaught NoClassDefFoundError from the Provider constructor, crashing provider initialization.

Fix:
- Change the meterProvider field type from SdkMeterProvider to Closeable (always on the classpath) and use a local SdkMeterProvider variable inside the try block
- Catch NoClassDefFoundError in the Provider constructor when creating FlagEvalMetrics
- Make getProviderHooks() and shutdown() null-safe when metrics is null
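
A sketch of the pattern described above, under stated assumptions: `OptionalMetrics` is a hypothetical stand-in for FlagEvalMetrics, and a `Supplier<Closeable>` stands in for the code that would build the SdkMeterProvider, so the example runs without any OTel jars. The point is that the field is typed as `Closeable`, and a `NoClassDefFoundError` raised during construction degrades to a disabled state instead of escaping.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical stand-in for a metrics holder whose SDK dependency is optional.
final class OptionalMetrics {
  // Typed as Closeable so this class's field signature never names an optional SDK class.
  private final Closeable meterProvider;

  OptionalMetrics(Supplier<Closeable> sdkFactory) {
    Closeable provider = null;
    try {
      // In real code the SDK type would only appear as a local variable in here.
      provider = sdkFactory.get();
    } catch (NoClassDefFoundError | RuntimeException e) {
      // SDK missing or failed to initialise: stay null and behave as a no-op.
    }
    this.meterProvider = provider;
  }

  boolean isEnabled() {
    return meterProvider != null;
  }

  // Null-safe: does nothing when metrics never initialised.
  void shutdown() {
    if (meterProvider == null) {
      return;
    }
    try {
      meterProvider.close();
    } catch (IOException ignored) {
      // Nothing useful to do on shutdown failure in this sketch.
    }
  }

  public static void main(String[] args) {
    OptionalMetrics missing =
        new OptionalMetrics(
            () -> {
              // Simulates the SDK class being absent from the classpath.
              throw new NoClassDefFoundError("io/opentelemetry/sdk/metrics/SdkMeterProvider");
            });
    System.out.println("enabled when SDK missing: " + missing.isEnabled());
    missing.shutdown();
  }
}
```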

FlagEvalHook references FlagEvalMetrics in its field declaration. On JVMs that eagerly verify field types during class loading, constructing FlagEvalHook outside the try/catch could throw NoClassDefFoundError if OTel classes failed to load. Moving it inside the try block ensures both metrics and hook are null-safe when OTel is absent.

Documents the published artifact setup, evaluation metrics dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp), OTLP endpoint configuration, metric attributes, and requirements.

System.getenv() is forbidden by the project's forbiddenApis rules. Replace with ConfigHelper.env() which is the approved way to read environment variables. Add config-utils as compileOnly dependency.

sameerank

Thanks for helping with this! I agree it was a good idea to break out the system test fixes into separate PRs to keep this one brief and focused

products/feature-flagging/feature-flagging-api/README.md

...-flagging/feature-flagging-api/src/main/java/datadog/trace/api/openfeature/FlagEvalHook.java

manuel-alvarez-alvarez · 2026-04-10T16:14:10Z

...agging/feature-flagging-api/src/main/java/datadog/trace/api/openfeature/FlagEvalMetrics.java

+              .setUnit(METRIC_UNIT)
+              .setDescription(METRIC_DESC)
+              .build();
+    } catch (NoClassDefFoundError | Exception e) {


Wouldn't it be better to just let the error flow to the Provider class since it's already capturing the exception?

Catching and logging here lets the Metrics driver still operate as a no-op.

manuel-alvarez-alvarez

LGTM, just left a couple of minor comments

- Remove transitive openfeature-sdk dep from README setup section
- Import ErrorCode at top of FlagEvalHook instead of inline FQN


- Add Options.evaluationLogging(boolean), default true per EVALLOG.12
- When disabled: no metrics, no hook, no error
- When enabled + OTel SDK missing: log.error with instructions to add deps or disable, degrade to no-op (matches Go/Python pattern)
- When enabled + OTel init failure: log.error with message, degrade
- Remove silent catch: FlagEvalMetrics now logs at error level for both NoClassDefFoundError and other init failures

The OTel SDK defaults to DELTA temporality for counters. The Datadog agent converts OTLP delta monotonic sums to rate metrics by dividing by the export interval (10s). Five evaluations in under 1s produce ~0.5, which rounds to zero in the points payload. Force CUMULATIVE temporality on the OtlpHttpMetricExporter so the agent receives an absolute count rather than a rate, making test_ffe_eval_metric_count reliable.
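
The arithmetic in the paragraph above can be checked with a small sketch. The class and method names are hypothetical, and the division by the export interval is taken from the description above rather than from agent source.

```java
// Hypothetical illustration of the delta-to-rate arithmetic described above.
final class TemporalityArithmetic {
  private TemporalityArithmetic() {}

  // A delta sum reported as a rate: count in the window divided by the window length.
  static double deltaAsRate(long deltaCount, long exportIntervalSeconds) {
    return (double) deltaCount / exportIntervalSeconds;
  }

  // A cumulative sum reports the running total, independent of the window length.
  static long cumulativeCount(long previousTotal, long newEvaluations) {
    return previousTotal + newEvaluations;
  }

  public static void main(String[] args) {
    // Five evaluations inside one 10 s export window.
    System.out.println("delta as rate: " + deltaAsRate(5, 10));
    System.out.println("cumulative count: " + cumulativeCount(0, 5));
  }
}
```

With delta temporality the test would see a fractional per-second value, while cumulative temporality yields the absolute count of 5 that a count-based assertion expects.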


- Remove exporterIsConfiguredWithCumulativeTemporalityForCounters test (tested OTel SDK, not our code; the integration test is the real regression guard)
- Fix Provider catch block comment to reflect that FlagEvalMetrics may not have logged if we reach this point
- Include exception in log.error calls for NoClassDefFoundError and general Exception to aid debugging
- Reword InMemoryMetricReader comment for precision

- Add debug log to FlagEvalMetrics.record() catch block so metric recording failures are visible in debug logs
- Widen Provider catch from NoClassDefFoundError to LinkageError to cover IncompatibleClassChangeError and other classloader issues from incompatible OTel SDK versions
- Add slf4j logger to Provider and log at error level when the fallback catch fires

The Provider catch is defense-in-depth for when FlagEvalMetrics class itself can't load (OTel API absent entirely). The detailed error message is logged inside FlagEvalMetrics when it CAN load but SDK init fails. Using error level here caused the openfeature smoke test to fail (it asserts no ERROR entries in application logs).

typotter · 2026-04-14T14:55:51Z

/merge

gh-worker-devflow-routing-ef8351 · 2026-04-14T14:55:57Z

View all feedbacks in Devflow UI.

2026-04-14 14:55:56 UTC ℹ️ Start processing command /merge

2026-04-14 14:56:03 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).

2026-04-14 15:38:10 UTC ℹ️ MergeQueue: Readding this merge request to the queue because another merge request processed with yours failed. No action is needed from your side.

2026-04-14 16:01:23 UTC ℹ️ MergeQueue: Retrying because a high-priority merge request needed to be processed first. No action is needed from your side.

2026-04-14 16:01:27 UTC ⚠️ MergeQueue: This merge request build was cancelled

tyler.potter@datadoghq.com cancelled this merge request build

Evaluation metrics are always attempted. If the OTel SDK is absent, the provider degrades gracefully with a warning. There is no user-facing toggle to disable metrics; this matches the Go and Python SDKs, which also always attempt metrics.

typotter added the labels type: feature request, tag: ai generated (Largely based on code generated by an AI or LLM), and comp: openfeature (OpenFeature) on Apr 2, 2026

typotter added 5 commits April 9, 2026 09:29

typotter force-pushed the typo/evaluations-logging branch from 4cb7bab to 69c5529 Compare April 9, 2026 15:30

Add README for dd-openfeature with eval metrics setup

18c0441


typotter marked this pull request as ready for review April 9, 2026 17:41

typotter requested a review from a team as a code owner April 9, 2026 17:41

typotter requested review from leoromanovsky and sameerank and removed request for a team April 9, 2026 17:41

Use ConfigHelper.env() instead of System.getenv()

da92198


typotter requested a review from manuel-alvarez-alvarez April 9, 2026 17:52

sameerank approved these changes Apr 10, 2026

manuel-alvarez-alvarez reviewed Apr 10, 2026

products/feature-flagging/feature-flagging-api/README.md (outdated, resolved)

manuel-alvarez-alvarez reviewed Apr 10, 2026

...-flagging/feature-flagging-api/src/main/java/datadog/trace/api/openfeature/FlagEvalHook.java (outdated, resolved)

manuel-alvarez-alvarez reviewed Apr 10, 2026

manuel-alvarez-alvarez approved these changes Apr 10, 2026

typotter added 5 commits April 10, 2026 13:15

Address PR review feedback from manuel-alvarez-alvarez

340f25c


Merge remote-tracking branch 'origin/master' into typo/evaluations-logging

12c5ba9

test(openfeature): verify cumulative temporality and count accumulation in FlagEvalMetrics

c3a7955

typotter mentioned this pull request Apr 13, 2026

Fix feature_flag.evaluations metric count always being zero #11072

Closed

3 tasks

typotter requested a review from manuel-alvarez-alvarez April 14, 2026 08:07

manuel-alvarez-alvarez approved these changes Apr 14, 2026


typotter enabled auto-merge April 14, 2026 16:20

typotter mentioned this pull request Apr 14, 2026

Add evaluation metrics section to Java feature flags docs DataDog/documentation#35970

Draft

1 task

typotter wants to merge 16 commits into master from typo/evaluations-logging
