Update Validator for /usr/bin symlink #2464

Open
JunAr7112 wants to merge 1 commit into NVIDIA:main from JunAr7112:fix_symlink

@JunAr7112
Contributor

Description

This PR addresses issue #1357. On some hosts, /usr/bin is itself a symlink, which prevents os.Lstat() from locating nvidia-smi. Inside the validator container, the host root filesystem is mounted at /host, so the validator checks /host/usr/bin/nvidia-smi.
But when /host/usr/bin is a symlink to an absolute target, /run/.../bin, path resolution starts from the validator container’s root, not from /host. In other words, the lookup goes to /run/.../bin/nvidia-smi inside the container's mount namespace instead of /host/run/.../bin/nvidia-smi. As a result, the validator falsely concludes that the host-installed driver is not present.

This change ensures that a symlinked /usr/bin is resolved relative to /host.

@JunAr7112 force-pushed the fix_symlink branch 2 times, most recently from a345546 to 037a688 (May 15, 2026 23:18)
Comment thread cmd/nvidia-validator/main.go Outdated
}
fileInfo, err := os.Lstat("/host/usr/bin/nvidia-smi")

nvidiaSMIPath := "/host/usr/bin/nvidia-smi"
Contributor

Can we move this newly added logic into a new function that correctly resolves the nvidia-smi path? Then we can easily write a few unit tests for it as well.

Contributor

+1. Let's please write unit tests in this PR.

@rahulait
Contributor

Overall, LGTM. Just a minor suggestion: move this logic into a separate function and add unit tests for it.

Contributor

@cdesiniotis left a comment

I would recommend we use existing packages for symlink resolution within a root. For example, we could use moby's FollowSymlinkInScope or https://pkg.go.dev/github.com/cyphar/filepath-securejoin. The former will probably suffice since the host path to nvidia-smi is assumed to be trustworthy in this context.

Comment thread cmd/nvidia-validator/main.go Outdated
if filepath.IsAbs(target) {
nvidiaSMIPath = filepath.Join("/host", strings.TrimPrefix(target, "/"), "nvidia-smi")
} else {
nvidiaSMIPath = filepath.Join("/host/usr", target, "nvidia-smi")
Contributor

I am not following this logic. For example, what happens if /host/usr/bin/nvidia-smi is a symlink to ../../nvidia-smi?

Contributor Author

I can try using FollowSymlinkInScope here. With the current logic, if /usr/bin points to an absolute host path, the code prefixes it with /host; if the target is relative, the code resolves it under /host/usr.
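
To make the behavior of the two branches in this thread concrete, the sketch below lifts them into a pure function so they can be exercised without a filesystem. It assumes, based on the reply above, that `target` is the link target read from /host/usr/bin; the function name `joinLikeDiff` and the sample targets are hypothetical.

```go
package main

import (
	"path/filepath"
	"strings"
)

// joinLikeDiff reproduces the two branches from the diff. target is
// assumed to be the symlink target of /host/usr/bin.
func joinLikeDiff(target string) string {
	if filepath.IsAbs(target) {
		// Absolute target: re-anchor it under the /host mount.
		return filepath.Join("/host", strings.TrimPrefix(target, "/"), "nvidia-smi")
	}
	// Relative target: interpret it relative to /host/usr.
	return filepath.Join("/host/usr", target, "nvidia-smi")
}
```

An absolute target such as /run/x/bin yields /host/run/x/bin/nvidia-smi, and a relative target such as sbin yields /host/usr/sbin/nvidia-smi. A relative target with enough parent segments, for example ../../run/x/bin, is cleaned by filepath.Join to /run/x/bin/nvidia-smi, which no longer sits under /host. The function also never inspects nvidia-smi itself, so a symlink at that final component, the case raised in the question above, is not resolved by this logic.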

Signed-off-by: Arjun <agadiyar@nvidia.com>