[observability] Diagnose NodeClockNotSynchronising on ACP Ubuntu nodes #485
jing2uo wants to merge 2 commits into
Walkthrough
This PR adds a new troubleshooting guide documenting how to diagnose and resolve the
Changes
NodeClockNotSynchronising Troubleshooting Documentation
Estimated Code Review Effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 5 passed
ae9cefb to 59629c3
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/en/solutions/Diagnose_NodeClockNotSynchronising_on_ACP_Ubuntu_nodes.md`:
- Line 47: Change the sentence that currently says the metric is “independent of
whether Prometheus is currently scraping” to clearly separate metric semantics
from query freshness: state that the metric value itself is produced by
adjtimex(2) and that a value of 0 corresponds to the STA_UNSYNC bit, but that
using promtool query instant (or any Prometheus query) will only return a sample
if the target has been scraped recently — results can be stale or missing even
though the kernel state exists. Update the wording around the
adjtimex(2)/STA_UNSYNC mention and add a short clause about scrape freshness and
staleness behavior for promtool query instant.
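
The metric semantics described above (a value of `0` when the `STA_UNSYNC` bit is set) can be illustrated with a small sketch. This is not the exporter's code: the function name `sync_status_from_flags` is hypothetical, and the only assumption taken from the kernel headers is that `STA_UNSYNC` is `0x0040` in `<linux/timex.h>`.

```python
# Status-bit constant as defined in <linux/timex.h>.
STA_UNSYNC = 0x0040


def sync_status_from_flags(status_flags: int) -> int:
    """Map adjtimex(2) status flags to the 0/1 convention used in this review.

    Hypothetical helper for illustration only: returns 0 when the kernel
    flags the clock as unsynchronised (STA_UNSYNC set), otherwise 1.
    """
    return 0 if status_flags & STA_UNSYNC else 1


if __name__ == "__main__":
    # 0x2001 = STA_NANO | STA_PLL: a disciplined clock without STA_UNSYNC.
    print(sync_status_from_flags(0x2001))
    # 0x2041 = the same flags with STA_UNSYNC added.
    print(sync_status_from_flags(0x2041))
```

The mapping describes kernel state only; whether a query returns that value at all is a separate question of scrape freshness.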
📒 Files selected for processing (1)
docs/en/solutions/Diagnose_NodeClockNotSynchronising_on_ACP_Ubuntu_nodes.md
## Diagnostic Steps

Confirm first that the kernel itself reports the clock as unsynchronised — that is the actual condition the alert is reacting to. The metric is produced from `adjtimex(2)` so its value is independent of whether Prometheus is currently scraping; a value of `0` corresponds to the `STA_UNSYNC` bit being set. Reference syntax for the in-cluster query (substitute the installed prometheus pod name and node label):
Clarify scrape-dependence of the queried value.
At Line 47, saying the value is “independent of whether Prometheus is currently scraping” is misleading for operators using promtool query instant; the returned sample still depends on successful recent scrapes (or can be stale/missing). Please reword to separate metric semantics from query freshness.
✏️ Proposed wording
-Confirm first that the kernel itself reports the clock as unsynchronised — that is the actual condition the alert is reacting to. The metric is produced from `adjtimex(2)` so its value is independent of whether Prometheus is currently scraping; a value of `0` corresponds to the `STA_UNSYNC` bit being set. Reference syntax for the in-cluster query (substitute the installed prometheus pod name and node label):
+Confirm first that the kernel itself reports the clock as unsynchronised — that is the actual condition the alert is reacting to. The metric is produced from `adjtimex(2)` and a value of `0` corresponds to the `STA_UNSYNC` bit being set; however, the value returned by `promtool query instant` reflects the most recent successfully scraped sample. Reference syntax for the in-cluster query (substitute the installed prometheus pod name and node label):
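
One way to make the freshness concern concrete: an instant query stamps its result with the evaluation time, so the age of the underlying sample has to be requested explicitly with the PromQL `timestamp()` function. The sketch below classifies the JSON body returned by the Prometheus HTTP API for such a query. It rests on assumptions not stated in this PR: the metric is taken to be `node_timex_sync_status`, the function name `classify_freshness` is hypothetical, and the 120-second threshold is an arbitrary illustration.

```python
import json


def classify_freshness(response_text: str, now: float,
                       max_age_seconds: float = 120.0) -> str:
    """Classify an instant-query response for `timestamp(<metric>)`.

    Hypothetical helper. Expects the JSON body of /api/v1/query for a
    query such as timestamp(node_timex_sync_status{...}) (metric name
    assumed). Returns "error", "missing", "stale", or "fresh".
    """
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        return "error"
    result = payload["data"]["result"]
    if not result:
        # No series returned: the target was not scraped within the
        # lookback window (five minutes by default) or the series was
        # marked stale.
        return "missing"
    # Each vector element carries value = [evaluation_time, "<string>"];
    # for timestamp(), the string is the sample's own scrape timestamp.
    sample_ts = float(result[0]["value"][1])
    return "fresh" if now - sample_ts <= max_age_seconds else "stale"


if __name__ == "__main__":
    body = json.dumps({
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [{"metric": {}, "value": [1000.0, "990"]}],
        },
    })
    print(classify_freshness(body, now=1000.0))
```

A "missing" or "stale" outcome says nothing about the kernel's clock state; it only means the queried value cannot be trusted as current.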
Adds a new ACP KB article.