feat: add samply profiler support by not-matthias · Pull Request #338 · CodSpeedHQ/codspeed

not-matthias · 2026-05-05T14:15:18Z

No description provided.

codspeed-hq · 2026-05-05T14:18:29Z

Merging this PR will not alter performance

✅ 4 untouched benchmarks

_{Comparing cod-2599-runner-migrate-perf-to-new-profiler-agnostic-data-format (8bbdf4f) with main (fd8701d)}

Move ELF/DWARF artifact extraction (elf_helper, module_symbols, unwind_data, debug_info, jit_dump) and the LoadedModule struct out of perf/ into a shared symbolication/ module. These helpers are agnostic to which sampling tool produced the module list and will be reused by future Linux profilers.

Output filename also changes from perf.metadata to walltime.metadata. This is a wire-breaking change that requires a coordinated server rollout to read the new filename.

PerfRunner becomes PerfProfiler implementing the Profiler trait. The RunnerFifo loop moves out of the perf module and into WallTimeExecutor, which now drives the trait callbacks (on_start_benchmark / on_stop_benchmark / on_ping / GetIntegrationMode) generically. Profiler::wrap stashes the perf control fifo and output path on the profiler instance; finalize harvests the perf.data artifacts and writes walltime.metadata. The OnceCell<BenchmarkData> bridge is gone — fifo data and timestamps are passed directly into finalize. Executor::run now takes &mut self to allow profilers to hold per-run state on the trait object. This cascades through run_executor, orchestrator, and the Memory/Valgrind executors (no behavioral change for those — they don't mutate self).

…BLED Introduce --enable-profiler / CODSPEED_PROFILER_ENABLED as the canonical flag now that walltime profiling is profiler-agnostic. The legacy --enable-perf / CODSPEED_PERF_ENABLED is kept as a hidden alias and emits a deprecation warning when used, so existing configurations keep working.

Tokio's OpenOptions::read_write is Linux-only, but the O_RDWR trick works on every Unix and is required on macOS to avoid open() deadlocks when both FIFO ends are opened before the peer process is spawned.

Mirror the clock used elsewhere in CodSpeed on macOS (mach_absolute_time with cached mach_timebase_info) so runner and benchmark process timestamps come from the exact same source. Adds mach2 as a macOS-only dependency.

Move the Linux profiling sysctl setup out of the perf profiler and into a shared walltime profiler helper. This lets both perf and samply prepare kernel symbol access and non-root profiling consistently while preserving the existing sudo-based behavior.

Use the samply crate directly instead of a standalone binary. Adds a `codspeed samply` subcommand that forwards args to samply's CLI and dispatches to `do_record_action`. The wall_time profiler re-execs the current binary, removing the need for a PATH-installed samply.

GuillaumeLagrange

Switching to github mid review, have to publish 😬

GuillaumeLagrange · 2026-05-07T15:26:58Z

+    ) -> anyhow::Result<CommandBuilder>;
+
+    /// The benchmarked process signaled the start of a measured region.
+    async fn on_start_benchmark(&mut self) -> anyhow::Result<()> {


Could we make use of the fact that we are changing stuff here to clear up the name of the markers?
We've discussed this multiple time, and it's quite a huge clusterfuck

The current names with markers is this

/// It expects the data to be structured like this: /// - SampleStart /// - BenchmarkStart /// - N samples /// - BenchmarkEnd /// - SampleEnd

But we also have the fifo commands StartBenchmark and StopBenchmark which actually push SampleStart/End markers

What I propose for this is just a rename that would make everything a bit clearer

The external marker, currently named SampleStart should be named ProfilerSamplingStart, or something that conveys the idea that we are starting and stopping the profiler's sampling functionality, and we should rename the FifoCommand::StartBenchmark to reflect that. Another naming possibility could be BenchmarkRegion if we want to be a bit farther from the implementation detail of starting/stopping the profiler

The internal marker would be clearer if it was something like RoundStart RoundEnd, as in practice it's what is being done.

WE SHOULD BE ABSOLUTELY EXTREMELY CAREFUL TO MAKE SURE IT DOES NOT IMPACT THE SERIALIZED IDs, it's only meant to be a naming clarification.

WDYT?

GuillaumeLagrange · 2026-05-07T15:26:59Z

                FifoCommand::GetIntegrationMode => {
                    return Ok(Some(FifoCommand::IntegrationModeResponse(
-                        IntegrationMode::Perf,
+                        IntegrationMode::Walltime,


We should make absolutely sure that the ser/de test inside instrument-hooks still works, and I believe explicitly updated.

Not sure which is the best order, but we need to be careful this does not get left behind as it had been before

GuillaumeLagrange

first pass done, let me know if you want to sync on some points

GuillaumeLagrange · 2026-05-07T15:27:48Z

@@ -28,18 +28,13 @@ use std::path::Path;
 use std::path::PathBuf;


Seen together, this commit is pretty much useless since smaply handles the symbolification

GuillaumeLagrange · 2026-05-07T15:37:04Z


 pub struct WallTimeExecutor {
-    perf: Option<PerfRunner>,
+    profiler: Option<Box<dyn Profiler>>,


Since it wont really change at runtime and is computed from static cfg asserts anyway, wouldnt it be better to have a type generic profiler rather than a Box<dyn> ?

GuillaumeLagrange · 2026-05-07T15:41:14Z

-        if let Some(perf) = &self.perf
-            && execution_context.config.enable_perf
-        {
-            perf.save_files_to(&execution_context.profile_folder)


Is there a reason the finalize part moved from teardown step to the execute step?

I'm guessing it has no side effect, but still conceptually it makes more sense to me to have the perf parsing etc as part of the executor teardown rather than inside the run step. WDYT?

GuillaumeLagrange · 2026-05-07T15:42:13Z

+            WallTimeExecutor::walltime_bench_cmd(&execution_context.config, execution_context)?;
+
+        let status = match self.profiler.as_mut() {
+            Some(profiler) if execution_context.config.enable_perf => {


Ignore if it is addressed in later commits, but we should rename everything that is perf related to profiler in the CLI interface, but still accept the perf options with a dperecation message

GuillaumeLagrange · 2026-05-07T15:49:30Z

@@ -152,16 +152,34 @@ pub enum UnwindingMode {

 #[derive(Args, Debug, Clone)]
 pub struct PerfRunArgs {


this should be renamed and the unwinding mode argument should be scoped to perf, could be done using a generic to avoid having to handle this at runtime

GuillaumeLagrange · 2026-05-07T15:56:08Z

+
+#[async_trait(?Send)]
+impl Profiler for SamplyProfiler {
+    async fn wrap(


can we rename this to wrap_command to make it slightly clearer?

GuillaumeLagrange · 2026-05-07T15:57:30Z

+    ) -> anyhow::Result<()> {
+        let Some(integration) = fifo_data.integration.clone() else {
+            warn!(
+                "Walltime profiling is enabled, but failed to detect benchmarks. If you wish to disable this warning, set CODSPEED_PROFILER_ENABLED=false"


Nit: could be commonized with perf error message

potentially through a trait function, called inside finalize, since the detection condition seem extremely similar

but it may be overkill, at least just commonize the message

GuillaumeLagrange · 2026-05-07T16:05:14Z

+    /// Run the bundled samply profiler. Args are forwarded to samply.
+    #[command(disable_help_flag = true, disable_help_subcommand = true)]
+    Samply(samply::SamplyArgs),
 }


Can we segregate these subcommands into a dedicated enum so we now at glance which commands are real entrypoints by users and which are actually just the cli re-invoking itself?

GuillaumeLagrange · 2026-05-07T16:06:39Z

-        let mut samply_builder = CommandBuilder::new("samply");
+        // samply is bundled into this binary as the `samply` subcommand;
+        // re-exec ourselves so we don't depend on a system install.
+        let current_exe = std::env::current_exe()


If we define the subcommands that are invoked by the CLI, we could have this as a helper that would just take the enum value and build the command with args etc

GuillaumeLagrange · 2026-05-07T16:07:42Z

-        cli::Action::Record(a) => ::samply::do_record_action(a),
-        _ => unimplemented!("Only `samply record` is supported"),
-    }
+    // samply spins up its own tokio runtime internally, so it must run on a


fixup with previous commit

not-matthias force-pushed the cod-2599-runner-migrate-perf-to-new-profiler-agnostic-data-format branch from 468b6c8 to 0e6e4ea Compare May 5, 2026 14:16

not-matthias requested a review from GuillaumeLagrange May 6, 2026 16:50

not-matthias added 10 commits May 6, 2026 18:50

feat(walltime): add Profiler trait abstraction

009bff1

refactor(walltime): rename IntegrationMode::Perf to Walltime

2fcc338

refactor(walltime): rename FifoCommand::PingPerf to PingProfiler

c288d7d

refactor(walltime): move perf module under profiler/

1b90106

refactor(walltime): rename PerfMetadata to WalltimeMetadata

2ebfba2

Output filename also changes from perf.metadata to walltime.metadata. This is a wire-breaking change that requires a coordinated server rollout to read the new filename.

fix(fifo): use O_RDWR to open FIFOs on all Unix platforms

2486f84

Tokio's OpenOptions::read_write is Linux-only, but the O_RDWR trick works on every Unix and is required on macOS to avoid open() deadlocks when both FIFO ends are opened before the peer process is spawned.

fix(walltime): use mach_absolute_time for FIFO timestamps on macOS

d27d3cd

Mirror the clock used elsewhere in CodSpeed on macOS (mach_absolute_time with cached mach_timebase_info) so runner and benchmark process timestamps come from the exact same source. Adds mach2 as a macOS-only dependency.

not-matthias force-pushed the cod-2599-runner-migrate-perf-to-new-profiler-agnostic-data-format branch from 6ad33b5 to f1d7ea2 Compare May 6, 2026 16:50

feat(walltime): add samply profiler for macOS

ed91edd

not-matthias force-pushed the cod-2599-runner-migrate-perf-to-new-profiler-agnostic-data-format branch from 9a18ed3 to b55c9b2 Compare May 7, 2026 13:37

not-matthias added 2 commits May 7, 2026 15:37

not-matthias force-pushed the cod-2599-runner-migrate-perf-to-new-profiler-agnostic-data-format branch from b55c9b2 to 9a2ab1b Compare May 7, 2026 13:37

fix(samply): spawn in new thread to prevent tokio panic

1fe921c

GuillaumeLagrange requested changes May 7, 2026

View reviewed changes

GuillaumeLagrange reviewed May 7, 2026

View reviewed changes

GuillaumeLagrange requested changes May 7, 2026

View reviewed changes

not-matthias added 2 commits May 7, 2026 19:13

chore: remove reuse-threads

f1fa303

fix: macos unknown addresses

8bbdf4f

		@@ -28,18 +28,13 @@ use std::path::Path;
		use std::path::PathBuf;

		@@ -152,16 +152,34 @@ pub enum UnwindingMode {

		#[derive(Args, Debug, Clone)]
		pub struct PerfRunArgs {

Conversation

not-matthias commented May 5, 2026

Uh oh!

codspeed-hq Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Uh oh!

GuillaumeLagrange left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GuillaumeLagrange left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codspeed-hq Bot commented May 5, 2026 •

edited

Loading