refactor: update gopsutil to v4, add overall host metrics #281

Open
Copilot wants to merge 29 commits into main from copilot/update-gopsutil-to-v4

Copilot AI commented Apr 19, 2026

  • Understand current overall metrics implementation
  • Review existing network bandwidth detection per platform
  • Refactor to track per-disk I/O utilization with individual device monitoring
  • Refactor to track per-network-interface utilization with individual bandwidth detection
  • Update data structures to store per-device snapshots and bandwidth
  • Emit per-device metrics in the event (disk_io.devices, network.devices)
  • Update bottleneck detection to identify which specific disk/interface is the bottleneck (e.g., "disk_io:sda", "network:eth0")
  • Auto-detect bandwidth per network interface
  • Update release notes with new features
  • Address code review feedback - extract default bandwidth as named constant
  • Restore config/generated.go (reverted accidental removal)
  • Remove breaking change note (PR not merged yet)

Changes Summary

  • Each disk device is monitored independently with individual I/O utilization percentages
  • Each network interface is monitored with auto-detected bandwidth per interface
  • Event output now includes:
    • disk_io.devices: Map of device name -> utilization
    • disk_io.bottleneck_device: Name of the most utilized disk
    • network.devices: Map of interface name -> {used_percent, bandwidth_mbps}
    • network.bottleneck_device: Name of the most utilized interface
  • Bottleneck identification now includes specific device names (e.g., disk_io:nvme0n1, network:eth0)
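
A minimal sketch of how a bottleneck label such as `disk_io:nvme0n1` could be derived from a per-device utilization map. The function names `bottleneckDevice` and `bottleneckLabel` are hypothetical and not taken from the PR. The name-based tie-break is an assumption added so the result is deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// bottleneckDevice returns the most utilized device and its utilization
// percentage. Names are visited in sorted order, so ties resolve to the
// alphabetically first device (an assumption, not the PR's rule).
// For an empty map it returns "" and -1.
func bottleneckDevice(devices map[string]float64) (string, float64) {
	names := make([]string, 0, len(devices))
	for name := range devices {
		names = append(names, name)
	}
	sort.Strings(names)

	best, bestPct := "", -1.0
	for _, name := range names {
		if pct := devices[name]; pct > bestPct {
			best, bestPct = name, pct
		}
	}
	return best, bestPct
}

// bottleneckLabel formats "<subsystem>:<device>", falling back to the bare
// subsystem name when no devices were reported.
func bottleneckLabel(subsystem string, devices map[string]float64) string {
	name, _ := bottleneckDevice(devices)
	if name == "" {
		return subsystem
	}
	return subsystem + ":" + name
}

func main() {
	disks := map[string]float64{"sda": 12.5, "nvme0n1": 87.0}
	fmt.Println(bottleneckLabel("disk_io", disks)) // prints "disk_io:nvme0n1"
}
```

Sorting the names first avoids depending on Go's randomized map iteration order, which would otherwise make the reported bottleneck flip between equally loaded devices.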

Copilot AI and others added 10 commits April 19, 2026 02:27
- Update all import paths from github.com/shirou/gopsutil/v3 to v4
  across 10 source files (disk, cpu, memory, network, stats, app,
  status, and host/sys_info)
- Update go.mod dependency to gopsutil/v4 v4.26.3
- Bump minimum Go version to 1.24.0 (required by gopsutil v4)
- Update transitive dependencies (go-sysconf, numcpus, perfstat, sys)

Key improvement: disk.IOCounters() on darwin/macOS is now implemented
via IOKit in gopsutil v4, resolving the 'not implemented yet' error
that previously occurred on macOS. The graceful degradation code in
disk.go is preserved as a safety net for unsupported platforms.

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/35abcace-ca46-4a40-8c8e-85acd8df15e2

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
Add a new 'host/overall' metric collector that computes composite system
health by evaluating CPU, memory, disk capacity, and disk I/O utilization.

Each subsystem is classified as green/yellow/red based on configurable
thresholds (default: 70% yellow, 90% red). The overall status reflects
the worst-performing subsystem, identifying the bottleneck.

Event payload includes per-subsystem breakdown with status and percentage,
plus the top-level status and bottleneck indicator for dashboard display.

New types in core/host:
- SubsystemHealth: per-subsystem status
- OverallStatus: composite health with bottleneck
- ClassifyHealth(): threshold-based health classification
- HealthPriority(): health comparison helper

Configuration (under metrics.overall):
- enabled: true/false
- yellow_threshold: percentage (default 70)
- red_threshold: percentage (default 90)
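
The threshold classification described in this commit might look roughly like the sketch below. Only the names `ClassifyHealth` and `HealthPriority` come from the commit message. Their signatures, the `Health` and `Subsystem` types, the `Overall` helper, and the choice of inclusive (`>=`) comparisons are all assumptions made for illustration.

```go
package main

import "fmt"

// Health is an assumed representation of the green/yellow/red status.
type Health string

const (
	Green  Health = "green"
	Yellow Health = "yellow"
	Red    Health = "red"
)

// ClassifyHealth maps a utilization percentage to a status.
// Treating the thresholds as inclusive is an assumption.
func ClassifyHealth(usedPercent, yellow, red float64) Health {
	switch {
	case usedPercent >= red:
		return Red
	case usedPercent >= yellow:
		return Yellow
	default:
		return Green
	}
}

// HealthPriority orders statuses so that a higher value means worse health.
func HealthPriority(h Health) int {
	switch h {
	case Red:
		return 2
	case Yellow:
		return 1
	default:
		return 0
	}
}

// Subsystem is a hypothetical per-subsystem reading.
type Subsystem struct {
	Name        string
	UsedPercent float64
}

// Overall returns the worst status and the first subsystem that reached it.
func Overall(subs []Subsystem, yellow, red float64) (Health, string) {
	status, bottleneck := Green, ""
	for _, s := range subs {
		h := ClassifyHealth(s.UsedPercent, yellow, red)
		if bottleneck == "" || HealthPriority(h) > HealthPriority(status) {
			status, bottleneck = h, s.Name
		}
	}
	return status, bottleneck
}

func main() {
	subs := []Subsystem{{"cpu", 35}, {"memory", 72}, {"disk_io", 93}}
	status, bottleneck := Overall(subs, 70, 90)
	fmt.Println(status, bottleneck)
}
```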

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/4eaec09e-6e0d-4805-bcfa-cc88c0a3a028

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
Address code review feedback: extract the 10s interval constant into a
configurable IntervalSeconds field (default: 10) on the Metric struct.
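
As a rough illustration of this change, a configurable interval with a default of 10 seconds could be sketched as follows. The `IntervalSeconds` field name comes from the commit message. The other struct fields, the `interval` method, the constant name, and the fallback for non-positive values are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// defaultIntervalSeconds mirrors the default stated in the commit message.
const defaultIntervalSeconds = 10

// Metric is a reduced, hypothetical version of the collector config.
type Metric struct {
	Enabled         bool
	IntervalSeconds int
}

// interval returns the collection interval, falling back to the default
// when the field is unset or not positive (assumed behavior).
func (m Metric) interval() time.Duration {
	if m.IntervalSeconds <= 0 {
		return defaultIntervalSeconds * time.Second
	}
	return time.Duration(m.IntervalSeconds) * time.Second
}

func main() {
	fmt.Println(Metric{Enabled: true}.interval())
	fmt.Println(Metric{Enabled: true, IntervalSeconds: 30}.interval())
}
```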

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/4eaec09e-6e0d-4805-bcfa-cc88c0a3a028

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
…eck fields

Remove status/bottleneck/threshold classification from the overall utilization
metric - the front layer handles green/yellow/red display logic.

Add network throughput (bytes/sec in+out) to complete the subsystem coverage:
CPU, memory, disk capacity, disk I/O, and network.

Remove now-unused types from core/host/host.go: SubsystemHealth, OverallStatus,
ClassifyHealth, HealthPriority, DefaultYellowThreshold, DefaultRedThreshold.

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/35d0ec62-94eb-4e02-a560-9dc9d927e75b

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
Network utilization is now reported as a percentage of configured bandwidth
(network_bandwidth_mbps, default 1000 Mbps). Uses max(in, out) throughput
relative to bandwidth, consistent with all other subsystems reporting
used_percent in the overall metric.
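
The calculation described above can be sketched as follows, assuming throughput is measured in bytes per second and 1 Mbps equals 1,000,000 bits per second. The function and constant names are hypothetical. Clamping the result to 100 and falling back to the default for a non-positive bandwidth are assumptions, not behavior confirmed by the commit.

```go
package main

import (
	"fmt"
	"math"
)

// defaultNetworkBandwidthMbps mirrors the default stated in the commit message.
const defaultNetworkBandwidthMbps = 1000.0

// networkUsedPercent converts the larger of the inbound and outbound
// throughput (bytes/sec) into a percentage of the link bandwidth (Mbps).
func networkUsedPercent(inBytesPerSec, outBytesPerSec, bandwidthMbps float64) float64 {
	if bandwidthMbps <= 0 {
		bandwidthMbps = defaultNetworkBandwidthMbps // assumed fallback
	}
	peakBitsPerSec := math.Max(inBytesPerSec, outBytesPerSec) * 8
	pct := peakBitsPerSec / (bandwidthMbps * 1e6) * 100
	return math.Min(pct, 100) // assumed clamp for bursts above the configured rate
}

func main() {
	// 62.5 MB/s inbound on a 1000 Mbps link is 500 Mbps, i.e. half the link.
	fmt.Println(networkUsedPercent(62_500_000, 10_000_000, 1000)) // prints 50
}
```

Using `max(in, out)` rather than the sum reflects a full-duplex link, where each direction has the full bandwidth available independently.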

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/bb90d5d0-209f-47c0-b000-9550e3005071

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
The file is already in .gitignore but was still being tracked.
This removes it from the index so future changes are properly ignored.

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/314530fc-4e7a-4937-be8f-3aa2679ba12a

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
Copilot AI and others added 2 commits April 19, 2026 07:38
Restore the file with N/A placeholder values instead of removing it
from tracking entirely. The file remains in .gitignore so local
build-regenerated values won't be accidentally committed.

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/5c37d476-1ae1-4466-963d-56cec89b93a3

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
The file should remain gitignored as it's regenerated during builds
with build-specific values (commit hashes, timestamps).

Agent-Logs-Url: https://github.com/infinilabs/framework/sessions/4f289f27-76e1-42f7-a474-d0f40cf1c518

Co-authored-by: medcl <64487+medcl@users.noreply.github.com>
@medcl changed the title from "Reviewing host metrics implementation and testing on darwin" to "refactor: update gopsutil to v4, add overall host metrics" on Apr 22, 2026
@medcl marked this pull request as ready for review on Apr 22, 2026 10:18
2 participants