Add flag evaluation metrics via OTel counter and OpenFeature Hook#11040
Benchmarks
Startup
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.055 s) : 0, 1054701
Total [baseline] (8.838 s) : 0, 8837845
Agent [candidate] (1.056 s) : 0, 1056172
Total [candidate] (8.845 s) : 0, 8844645
section iast
Agent [baseline] (1.225 s) : 0, 1225204
Total [baseline] (9.579 s) : 0, 9579185
Agent [candidate] (1.225 s) : 0, 1224723
Total [candidate] (9.579 s) : 0, 9578576
gantt
title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.209 ms) : 0, 1209
BytebuddyAgent [baseline] (632.642 ms) : 0, 632642
BytebuddyAgent [candidate] (632.853 ms) : 0, 632853
AgentMeter [baseline] (29.478 ms) : 0, 29478
AgentMeter [candidate] (29.328 ms) : 0, 29328
GlobalTracer [baseline] (248.626 ms) : 0, 248626
GlobalTracer [candidate] (248.492 ms) : 0, 248492
AppSec [baseline] (32.06 ms) : 0, 32060
AppSec [candidate] (32.1 ms) : 0, 32100
Debugger [baseline] (59.155 ms) : 0, 59155
Debugger [candidate] (59.319 ms) : 0, 59319
Remote Config [baseline] (597.5 µs) : 0, 598
Remote Config [candidate] (592.601 µs) : 0, 593
Telemetry [baseline] (8.069 ms) : 0, 8069
Telemetry [candidate] (8.096 ms) : 0, 8096
Flare Poller [baseline] (6.769 ms) : 0, 6769
Flare Poller [candidate] (8.14 ms) : 0, 8140
section iast
crashtracking [baseline] (1.227 ms) : 0, 1227
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (801.631 ms) : 0, 801631
BytebuddyAgent [candidate] (802.325 ms) : 0, 802325
AgentMeter [baseline] (11.402 ms) : 0, 11402
AgentMeter [candidate] (11.419 ms) : 0, 11419
GlobalTracer [baseline] (239.371 ms) : 0, 239371
GlobalTracer [candidate] (239.338 ms) : 0, 239338
IAST [baseline] (25.892 ms) : 0, 25892
IAST [candidate] (25.864 ms) : 0, 25864
AppSec [baseline] (28.785 ms) : 0, 28785
AppSec [candidate] (30.229 ms) : 0, 30229
Debugger [baseline] (63.178 ms) : 0, 63178
Debugger [candidate] (62.446 ms) : 0, 62446
Remote Config [baseline] (1.164 ms) : 0, 1164
Remote Config [candidate] (536.498 µs) : 0, 536
Telemetry [baseline] (12.846 ms) : 0, 12846
Telemetry [candidate] (11.477 ms) : 0, 11477
Flare Poller [baseline] (3.415 ms) : 0, 3415
Flare Poller [candidate] (3.558 ms) : 0, 3558
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1056136
Total [baseline] (11.171 s) : 0, 11170559
Agent [candidate] (1.055 s) : 0, 1055112
Total [candidate] (11.069 s) : 0, 11068635
section appsec
Agent [baseline] (1.249 s) : 0, 1249109
Total [baseline] (11.125 s) : 0, 11125396
Agent [candidate] (1.247 s) : 0, 1246854
Total [candidate] (11.064 s) : 0, 11063540
section iast
Agent [baseline] (1.243 s) : 0, 1242623
Total [baseline] (11.297 s) : 0, 11296861
Agent [candidate] (1.23 s) : 0, 1229872
Total [candidate] (11.267 s) : 0, 11267069
section profiling
Agent [baseline] (1.187 s) : 0, 1187372
Total [baseline] (11.169 s) : 0, 11168511
Agent [candidate] (1.182 s) : 0, 1182329
Total [candidate] (11.051 s) : 0, 11050685
gantt
title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.217 ms) : 0, 1217
BytebuddyAgent [baseline] (632.767 ms) : 0, 632767
BytebuddyAgent [candidate] (631.1 ms) : 0, 631100
AgentMeter [baseline] (29.43 ms) : 0, 29430
AgentMeter [candidate] (29.416 ms) : 0, 29416
GlobalTracer [baseline] (249.195 ms) : 0, 249195
GlobalTracer [candidate] (248.948 ms) : 0, 248948
AppSec [baseline] (31.894 ms) : 0, 31894
AppSec [candidate] (32.057 ms) : 0, 32057
Debugger [baseline] (60.094 ms) : 0, 60094
Debugger [candidate] (60.149 ms) : 0, 60149
Remote Config [baseline] (596.779 µs) : 0, 597
Remote Config [candidate] (609.218 µs) : 0, 609
Telemetry [baseline] (8.1 ms) : 0, 8100
Telemetry [candidate] (8.163 ms) : 0, 8163
Flare Poller [baseline] (6.717 ms) : 0, 6717
Flare Poller [candidate] (7.34 ms) : 0, 7340
section appsec
crashtracking [baseline] (1.226 ms) : 0, 1226
crashtracking [candidate] (1.214 ms) : 0, 1214
BytebuddyAgent [baseline] (661.92 ms) : 0, 661920
BytebuddyAgent [candidate] (661.223 ms) : 0, 661223
AgentMeter [baseline] (12.056 ms) : 0, 12056
AgentMeter [candidate] (12.078 ms) : 0, 12078
GlobalTracer [baseline] (249.511 ms) : 0, 249511
GlobalTracer [candidate] (248.932 ms) : 0, 248932
IAST [baseline] (24.618 ms) : 0, 24618
IAST [candidate] (24.574 ms) : 0, 24574
AppSec [baseline] (184.913 ms) : 0, 184913
AppSec [candidate] (183.915 ms) : 0, 183915
Debugger [baseline] (65.792 ms) : 0, 65792
Debugger [candidate] (65.715 ms) : 0, 65715
Remote Config [baseline] (629.984 µs) : 0, 630
Remote Config [candidate] (597.85 µs) : 0, 598
Telemetry [baseline] (8.582 ms) : 0, 8582
Telemetry [candidate] (8.69 ms) : 0, 8690
Flare Poller [baseline] (3.559 ms) : 0, 3559
Flare Poller [candidate] (3.566 ms) : 0, 3566
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.235 ms) : 0, 1235
BytebuddyAgent [baseline] (814.562 ms) : 0, 814562
BytebuddyAgent [candidate] (805.386 ms) : 0, 805386
AgentMeter [baseline] (11.66 ms) : 0, 11660
AgentMeter [candidate] (11.449 ms) : 0, 11449
GlobalTracer [baseline] (242.232 ms) : 0, 242232
GlobalTracer [candidate] (240.501 ms) : 0, 240501
IAST [baseline] (26.233 ms) : 0, 26233
IAST [candidate] (26.718 ms) : 0, 26718
AppSec [baseline] (31.186 ms) : 0, 31186
AppSec [candidate] (30.881 ms) : 0, 30881
Debugger [baseline] (60.677 ms) : 0, 60677
Debugger [candidate] (61.309 ms) : 0, 61309
Remote Config [baseline] (525.706 µs) : 0, 526
Remote Config [candidate] (525.674 µs) : 0, 526
Telemetry [baseline] (13.537 ms) : 0, 13537
Telemetry [candidate] (11.911 ms) : 0, 11911
Flare Poller [baseline] (3.55 ms) : 0, 3550
Flare Poller [candidate] (3.514 ms) : 0, 3514
section profiling
crashtracking [baseline] (1.187 ms) : 0, 1187
crashtracking [candidate] (1.188 ms) : 0, 1188
BytebuddyAgent [baseline] (693.378 ms) : 0, 693378
BytebuddyAgent [candidate] (690.17 ms) : 0, 690170
AgentMeter [baseline] (9.128 ms) : 0, 9128
AgentMeter [candidate] (9.065 ms) : 0, 9065
GlobalTracer [baseline] (207.772 ms) : 0, 207772
GlobalTracer [candidate] (206.849 ms) : 0, 206849
AppSec [baseline] (32.631 ms) : 0, 32631
AppSec [candidate] (32.53 ms) : 0, 32530
Debugger [baseline] (65.809 ms) : 0, 65809
Debugger [candidate] (65.561 ms) : 0, 65561
Remote Config [baseline] (569.285 µs) : 0, 569
Remote Config [candidate] (579.628 µs) : 0, 580
Telemetry [baseline] (7.871 ms) : 0, 7871
Telemetry [candidate] (7.846 ms) : 0, 7846
Flare Poller [baseline] (3.572 ms) : 0, 3572
Flare Poller [candidate] (3.594 ms) : 0, 3594
ProfilingAgent [baseline] (94.043 ms) : 0, 94043
ProfilingAgent [candidate] (93.834 ms) : 0, 93834
Profiling [baseline] (94.604 ms) : 0, 94604
Profiling [candidate] (94.39 ms) : 0, 94390
Load
Summary: Found 3 performance improvements and 1 performance regression! Performance is the same for 17 metrics, 15 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section baseline
no_agent (19.173 ms) : 18977, 19368
. : milestone, 19173,
appsec (18.798 ms) : 18609, 18987
. : milestone, 18798,
code_origins (17.942 ms) : 17768, 18116
. : milestone, 17942,
iast (17.878 ms) : 17699, 18057
. : milestone, 17878,
profiling (18.902 ms) : 18712, 19091
. : milestone, 18902,
tracing (17.913 ms) : 17733, 18093
. : milestone, 17913,
section candidate
no_agent (16.942 ms) : 16778, 17106
. : milestone, 16942,
appsec (18.954 ms) : 18763, 19144
. : milestone, 18954,
code_origins (18.036 ms) : 17859, 18213
. : milestone, 18036,
iast (17.909 ms) : 17734, 18085
. : milestone, 17909,
profiling (18.207 ms) : 18029, 18385
. : milestone, 18207,
tracing (18.279 ms) : 18098, 18460
. : milestone, 18279,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section baseline
no_agent (1.252 ms) : 1239, 1264
. : milestone, 1252,
iast (3.344 ms) : 3294, 3393
. : milestone, 3344,
iast_FULL (6.204 ms) : 6139, 6269
. : milestone, 6204,
iast_GLOBAL (3.638 ms) : 3584, 3692
. : milestone, 3638,
profiling (2.439 ms) : 2415, 2462
. : milestone, 2439,
tracing (1.922 ms) : 1906, 1938
. : milestone, 1922,
section candidate
no_agent (1.247 ms) : 1235, 1260
. : milestone, 1247,
iast (3.465 ms) : 3414, 3515
. : milestone, 3465,
iast_FULL (6.197 ms) : 6133, 6260
. : milestone, 6197,
iast_GLOBAL (3.612 ms) : 3561, 3663
. : milestone, 3612,
profiling (2.315 ms) : 2292, 2338
. : milestone, 2315,
tracing (1.928 ms) : 1912, 1944
. : milestone, 1928,
Dacapo
Summary: Found 1 performance improvement and 0 performance regressions! Performance is the same for 11 metrics, 0 unstable metrics.
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section baseline
no_agent (1.5 ms) : 1488, 1511
. : milestone, 1500,
appsec (3.813 ms) : 3594, 4033
. : milestone, 3813,
iast (2.294 ms) : 2224, 2363
. : milestone, 2294,
iast_GLOBAL (2.344 ms) : 2274, 2414
. : milestone, 2344,
profiling (2.12 ms) : 2064, 2175
. : milestone, 2120,
tracing (2.097 ms) : 2043, 2151
. : milestone, 2097,
section candidate
no_agent (1.502 ms) : 1490, 1514
. : milestone, 1502,
appsec (2.559 ms) : 2504, 2614
. : milestone, 2559,
iast (2.282 ms) : 2213, 2351
. : milestone, 2282,
iast_GLOBAL (2.352 ms) : 2282, 2422
. : milestone, 2352,
profiling (2.112 ms) : 2057, 2167
. : milestone, 2112,
tracing (2.098 ms) : 2045, 2152
. : milestone, 2098,
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~93af7a8bd4, baseline=1.62.0-SNAPSHOT~9f89a0b26cc
dateFormat X
axisFormat %s
section baseline
no_agent (15.681 s) : 15681000, 15681000
. : milestone, 15681000,
appsec (14.64 s) : 14640000, 14640000
. : milestone, 14640000,
iast (18.443 s) : 18443000, 18443000
. : milestone, 18443000,
iast_GLOBAL (18.058 s) : 18058000, 18058000
. : milestone, 18058000,
profiling (15.445 s) : 15445000, 15445000
. : milestone, 15445000,
tracing (14.892 s) : 14892000, 14892000
. : milestone, 14892000,
section candidate
no_agent (14.941 s) : 14941000, 14941000
. : milestone, 14941000,
appsec (14.948 s) : 14948000, 14948000
. : milestone, 14948000,
iast (18.341 s) : 18341000, 18341000
. : milestone, 18341000,
iast_GLOBAL (17.978 s) : 17978000, 17978000
. : milestone, 17978000,
profiling (15.038 s) : 15038000, 15038000
. : milestone, 15038000,
tracing (14.942 s) : 14942000, 14942000
. : milestone, 14942000,
Record a `feature_flag.evaluations` OTel counter on every flag evaluation using an OpenFeature `finallyAfter` hook. The hook captures all evaluation paths including type mismatches that occur above the provider level. Attributes: feature_flag.key, feature_flag.result.variant, feature_flag.result.reason, error.type (on error), feature_flag.result.allocation_key (when present). Counter is a no-op when DD_METRICS_OTEL_ENABLED is false or opentelemetry-api is absent from the classpath.
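As a rough illustration of the attribute rules in this commit message, the sketch below builds the attribute set for one evaluation using plain Java collections. The class name `FlagEvalAttributes`, the method `build`, and the null-means-absent convention are assumptions made for this example; the real hook reads these values from the OpenFeature evaluation details and records them on an OTel counter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the PR): models which attributes are
// attached to one feature_flag.evaluations data point.
final class FlagEvalAttributes {
  private FlagEvalAttributes() {}

  static Map<String, String> build(
      String flagKey, String variant, String reason, String errorType, String allocationKey) {
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put("feature_flag.key", flagKey);
    // Assumption: a null value means "not available" and the attribute is omitted.
    if (variant != null) {
      attrs.put("feature_flag.result.variant", variant);
    }
    if (reason != null) {
      attrs.put("feature_flag.result.reason", reason);
    }
    // error.type is only set when the evaluation ended in an error.
    if (errorType != null) {
      attrs.put("error.type", errorType);
    }
    // allocation_key is only set when the evaluation result carries one.
    if (allocationKey != null) {
      attrs.put("feature_flag.result.allocation_key", allocationKey);
    }
    return attrs;
  }
}
```

A successful evaluation without an allocation key would then carry three attributes, while an errored one additionally carries `error.type`.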
Replace GlobalOpenTelemetry.getMeterProvider() with a dedicated SdkMeterProvider + OtlpHttpMetricExporter that sends metrics directly to the DD Agent's OTLP endpoint (default :4318/v1/metrics). This avoids the agent's OTel class shading issue where the agent relocates io.opentelemetry.api.* to datadog.trace.bootstrap.otel.api.*, making GlobalOpenTelemetry calls from the dd-openfeature jar hit the unshaded no-op provider instead of the agent's shim. Requires opentelemetry-sdk-metrics and opentelemetry-exporter-otlp on the application classpath. Falls back to no-op if absent. System tests: 11/17 pass. 6 failures are pre-existing DDEvaluator gaps (reason mapping, parse errors, type mismatch strictness).
- Add explicit null guard for details in FlagEvalHook.finallyAfter()
- Add OTEL_EXPORTER_OTLP_ENDPOINT generic env var fallback with /v1/metrics path appended (per OTel spec fallback chain)
- Add comments clarifying signal-specific vs generic endpoint behavior
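The endpoint fallback chain described here can be sketched as a small pure function. This is only an illustration under assumptions: the class `OtlpMetricsEndpoint` and its `resolve` method are hypothetical, the environment is passed in as a map rather than read through the project's config helper, and stripping a trailing slash before appending the path is a choice made for this example rather than something the PR states.

```java
import java.util.Map;

// Hypothetical helper (not part of the PR): resolves the OTLP metrics endpoint.
final class OtlpMetricsEndpoint {
  static final String METRICS_PATH = "/v1/metrics";
  static final String DEFAULT_ENDPOINT = "http://localhost:4318" + METRICS_PATH;

  private OtlpMetricsEndpoint() {}

  static String resolve(Map<String, String> env) {
    // 1. Signal-specific variable: used as-is, no path appended.
    String specific = env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT");
    if (specific != null && !specific.isEmpty()) {
      return specific;
    }
    // 2. Generic variable: treated as a base URL, /v1/metrics is appended.
    String generic = env.get("OTEL_EXPORTER_OTLP_ENDPOINT");
    if (generic != null && !generic.isEmpty()) {
      // Assumption: avoid a double slash when the base URL ends with "/".
      String base = generic.endsWith("/") ? generic.substring(0, generic.length() - 1) : generic;
      return base + METRICS_PATH;
    }
    // 3. Default: the local agent's OTLP HTTP port.
    return DEFAULT_ENDPOINT;
  }
}
```

Keeping the resolution in a function that takes the environment as a parameter makes each step of the chain easy to unit test without touching process state.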
When the OTel SDK jars are not on the application classpath, loading FlagEvalMetrics fails because field types reference OTel SDK classes (SdkMeterProvider). This propagated as an uncaught NoClassDefFoundError from the Provider constructor, crashing provider initialization.
Fix:
- Change meterProvider field type from SdkMeterProvider to Closeable (always on classpath), use local SdkMeterProvider variable inside try block
- Catch NoClassDefFoundError in Provider constructor when creating FlagEvalMetrics
- Null-safe getProviderHooks() and shutdown() when metrics is null
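The pattern in this fix can be sketched without any OTel dependency. In the example below, `GuardedMetrics` and the `Supplier<Closeable>` factory are hypothetical stand-ins: the factory plays the role of the code that builds the `SdkMeterProvider`, and it is only an assumption that the real class is structured this way.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical stand-in for the metrics holder (not the PR's actual class).
final class GuardedMetrics {
  // Typed as Closeable so loading this class never requires an optional SDK class.
  private final Closeable meterProvider;

  GuardedMetrics(Supplier<Closeable> sdkFactory) {
    Closeable provider = null;
    try {
      // SDK types are only touched inside the try block.
      provider = sdkFactory.get();
    } catch (NoClassDefFoundError | Exception e) {
      // SDK missing or init failed: degrade to a no-op instead of crashing.
    }
    this.meterProvider = provider;
  }

  boolean isEnabled() {
    return meterProvider != null;
  }

  void shutdown() {
    if (meterProvider == null) {
      return; // null-safe: nothing was created
    }
    try {
      meterProvider.close();
    } catch (IOException ignored) {
      // best-effort shutdown
    }
  }
}
```

Passing a factory that throws `NoClassDefFoundError` leaves the holder disabled, and `shutdown()` remains safe to call in that state.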
FlagEvalHook references FlagEvalMetrics in its field declaration. On JVMs that eagerly verify field types during class loading, constructing FlagEvalHook outside the try/catch could throw NoClassDefFoundError if OTel classes failed to load. Moving it inside the try block ensures both metrics and hook are null-safe when OTel is absent.
4cb7bab to 69c5529
Documents the published artifact setup, evaluation metrics dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp), OTLP endpoint configuration, metric attributes, and requirements.
System.getenv() is forbidden by the project's forbiddenApis rules. Replace with ConfigHelper.env() which is the approved way to read environment variables. Add config-utils as compileOnly dependency.
sameerank
left a comment
Thanks for helping with this! I agree it was a good idea to break out the system test fixes into separate PRs to keep this one brief and focused
...-flagging/feature-flagging-api/src/main/java/datadog/trace/api/openfeature/FlagEvalHook.java
    .setUnit(METRIC_UNIT)
    .setDescription(METRIC_DESC)
    .build();
} catch (NoClassDefFoundError | Exception e) {
Wouldn't it be better to just let the error flow to the Provider class since it's already capturing the exception?
Catching and logging here lets the Metrics driver still operate as a no-op.
manuel-alvarez-alvarez
left a comment
LGTM, just left a couple of minor comments
- Remove transitive openfeature-sdk dep from README setup section
- Import ErrorCode at top of FlagEvalHook instead of inline FQN
- Add Options.evaluationLogging(boolean), default true per EVALLOG.12
- When disabled: no metrics, no hook, no error
- When enabled + OTel SDK missing: log.error with instructions to add deps or disable, degrade to no-op (matches Go/Python pattern)
- When enabled + OTel init failure: log.error with message, degrade
- Remove silent catch: FlagEvalMetrics now logs at error level for both NoClassDefFoundError and other init failures
The OTel SDK defaults to DELTA temporality for counters. The Datadog agent converts OTLP delta monotonic sums to rate metrics by dividing by the export interval (10s). Five evaluations in under 1s produce ~0.5, which rounds to zero in the points payload. Force CUMULATIVE temporality on the OtlpHttpMetricExporter so the agent receives an absolute count rather than a rate, making test_ffe_eval_metric_count reliable.
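The arithmetic behind this commit can be worked through in a few lines. The class `TemporalityArithmetic` and its methods are hypothetical and only model the numbers quoted above (5 evaluations, 10 s export interval); they do not reproduce the agent's actual conversion code.

```java
// Hypothetical worked example (not part of the PR).
final class TemporalityArithmetic {
  private TemporalityArithmetic() {}

  // DELTA temporality: the agent turns the per-interval delta into a rate.
  static double deltaAsRate(long deltaCount, double exportIntervalSeconds) {
    return deltaCount / exportIntervalSeconds;
  }

  // CUMULATIVE temporality: the exported value is the running total so far.
  static long cumulativeTotal(long... deltas) {
    long total = 0;
    for (long d : deltas) {
      total += d;
    }
    return total;
  }
}
```

With these inputs, `deltaAsRate(5, 10.0)` yields `0.5`, the fractional value the commit message describes, whereas `cumulativeTotal(5)` stays at `5`, which is the absolute count the test asserts on.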
…on in FlagEvalMetrics
- Remove exporterIsConfiguredWithCumulativeTemporalityForCounters test (tested OTel SDK, not our code; the integration test is the real regression guard)
- Fix Provider catch block comment to reflect that FlagEvalMetrics may not have logged if we reach this point
- Include exception in log.error calls for NoClassDefFoundError and general Exception to aid debugging
- Reword InMemoryMetricReader comment for precision
- Add debug log to FlagEvalMetrics.record() catch block so metric recording failures are visible in debug logs
- Widen Provider catch from NoClassDefFoundError to LinkageError to cover IncompatibleClassChangeError and other classloader issues from incompatible OTel SDK versions
- Add slf4j logger to Provider and log at error level when the fallback catch fires
The Provider catch is defense-in-depth for when FlagEvalMetrics class itself can't load (OTel API absent entirely). The detailed error message is logged inside FlagEvalMetrics when it CAN load but SDK init fails. Using error level here caused the openfeature smoke test to fail (it asserts no ERROR entries in application logs).
/merge
The expected merge time in
tyler.potter@datadoghq.com cancelled this merge request build
Evaluation metrics are always attempted. If the OTel SDK is absent, the provider degrades gracefully with a warning. There is no user-facing toggle to disable metrics; this matches the Go and Python SDKs, which also always attempt metrics.
What Does This Do
Records a `feature_flag.evaluations` OTel counter metric on every flag evaluation via an OpenFeature `finallyAfter` hook. The hook captures all evaluation paths including type mismatches that occur above the provider level in the OpenFeature SDK pipeline.
Creates a dedicated `SdkMeterProvider` with an `OtlpHttpMetricExporter` that sends metrics directly to the DD Agent's OTLP endpoint (`/v1/metrics`). This avoids the agent's OTel class shading (`io.opentelemetry.api.*` → `datadog.trace.bootstrap.otel.api.*`) which prevents using `GlobalOpenTelemetry` from the published `dd-openfeature` jar.
Metric attributes:
- `feature_flag.key`
- `feature_flag.result.variant`
- `feature_flag.result.reason`
- `error.type`
- `feature_flag.result.allocation_key`
New files: `FlagEvalMetrics.java`, `FlagEvalHook.java`, `FlagEvalMetricsTest.java`, `FlagEvalHookTest.java`
Modified files: `Provider.java` (adds `getProviderHooks()`), `ProviderTest.java`, `build.gradle.kts`
Motivation
Evaluation metrics allow tracking how many times flags are evaluated, with which results, across sessions. This is the Java implementation of the evaluation logging spec (FFL-1942), matching the existing Python (dd-trace-py#17029) and Go (dd-trace-go#4489) implementations.
System tests: 11/17 pass. The 6 remaining failures are pre-existing DDEvaluator gaps (reason mapping, parse error codes) addressed in separate PRs (#11036, #10971).
References:
Additional Notes
- OTel SDK dependencies (`opentelemetry-sdk-metrics`, `opentelemetry-exporter-otlp`) are `compileOnly`: applications must include them on the classpath for metrics to flow. Falls back to silent no-op when absent.
- `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` → `OTEL_EXPORTER_OTLP_ENDPOINT` + `/v1/metrics` → `http://localhost:4318/v1/metrics`
Contributor Checklist
- `type:` and (`comp:` or `inst:`) labels
- `close`, `fix`, or any linking keywords when referencing an issue
Jira ticket: FFL-1942