From ccd52f81e1a539e182c4f7fcfdff2ad58f3fb071 Mon Sep 17 00:00:00 2001
From: Komh <mail@guojing.io>
Date: Sun, 26 Apr 2026 02:55:40 +0000
Subject: [PATCH 1/2] [virtualization] Cold VM Migration from VMware Hangs at
 Conversion-Progress Reporting

---
 ..._Hangs_at_Conversion_Progress_Reporting.md | 112 ++++++++++++++++++
 1 file changed, 112 insertions(+)
 create mode 100644 docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md
diff --git a/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md b/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md
new file mode 100644
index 000000000..cb33f6b12
--- /dev/null
+++ b/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md
@@ -0,0 +1,112 @@
+---
+kind:
+   - Troubleshooting
+products:
+   - Alauda Container Platform
+ProductsVersion:
+   - 4.1.0,4.2.x
+---
+## Issue
+
+A cold migration of a VMware VM into ACP Virtualization stalls at the final step. The migration plan sits at "converting…" forever. The `virt-v2v` pod in the target namespace completes its conversion work and opens an HTTP server to report the resulting VM XML, but the migration controller never completes the plan. Representative symptoms:
+
+- `virt-v2v` pod log ends with `Starting server on:8080`, no crash, no further progress.
+- Migration-controller log in the platform-migration namespace reports:
+
+  ```text
+  msg="Failed to update conversion progress"
+  error="Get \"http://10.128.1.34:2112/metrics\": dial tcp 10.128.1.34:2112: i/o timeout"
+  ```
+
+## Root Cause
+
+The VM-from-VMware workflow has two pods in two namespaces:
+
+- The **migration controller** (Forklift controller) runs in the platform migration namespace (install-specific; on ACP it's the Virtualization migration component's namespace).
+- The **virt-v2v + vddk** conversion pod runs in the **target** namespace — where the imported VM will live.
+
+At the end of conversion, virt-v2v exposes two HTTP endpoints to the controller:
+
+| Port | Purpose |
+|---|---|
+| `8080` | serves the produced VM XML (for the controller to read and create the VMI) |
+| `2112` | exposes conversion progress metrics |
+
+If the **target** namespace has a default-deny NetworkPolicy (or any policy that doesn't explicitly admit ingress from the migration-controller namespace), both of those endpoints are unreachable. The controller times out on `:2112/metrics`, the plan never reads the finished XML from `:8080`, and the migration hangs indefinitely.
+
+Common-case predicate: the target namespace carries a standard "deny-by-default + allow-from-same-namespace + allow-from-ingress + allow-from-monitoring" set of policies — which is a reasonable security baseline but doesn't know about the migration controller.
+
+## Resolution
+
+Admit ingress from the migration-controller namespace on the target namespace. Apply only while migrations are active, or keep it as a standing policy if the target namespace routinely receives VM imports.
+
+1. **Identify the migration-controller namespace.** It carries the virtualization migration controller pod. The exact name depends on the ACP Virtualization install:
+
+   ```bash
+   # Find the controller pod by label (common selector)
+   kubectl get pod -A -l app.kubernetes.io/component=forklift-controller -o wide
+   # Or by deployment name pattern
+   kubectl get pod -A | grep -E 'forklift.*controller|migration.*controller'
+   ```
+
+   Whatever namespace shows up is the one to admit.
+
+2. **Add the allow-ingress policy in the target namespace.** Pair the namespace selector with a pod selector so only the conversion pod is reachable:
+
+   ```yaml
+   apiVersion: networking.k8s.io/v1
+   kind: NetworkPolicy
+   metadata:
+     name: allow-from-migration-controller
+     namespace: <target-ns>
+   spec:
+     podSelector:
+       matchExpressions:
+         - key: forklift.konveyor.io/plan
+           operator: Exists
+     policyTypes:
+       - Ingress
+     ingress:
+       - from:
+           - namespaceSelector:
+               matchLabels:
+                 kubernetes.io/metadata.name: <migration-controller-ns>
+         ports:
+           - protocol: TCP
+             port: 8080
+           - protocol: TCP
+             port: 2112
+   ```
+
+   The `podSelector` scopes the rule to the virt-v2v pods only (they carry the `forklift.konveyor.io/plan` label). This keeps the rest of the namespace's default-deny posture intact.
+
+3. **Re-run the plan.** The controller picks up the next reconcile within ~10s and completes. No need to restart anything.
+
+4. **For a standing solution**, include the allow-from-migration-controller policy in the namespace template / project template used for VM-import landing zones. Once a team knows their namespace will receive migrations, the policy should be pre-provisioned.
+
+## Diagnostic Steps
+
+Confirm the controller is timing out on the metrics endpoint:
+
+```bash
+kubectl -n <migration-controller-ns> logs deploy/<controller> --tail=200 \
+  | grep -i 'Failed to update conversion progress'
+```
+
+Confirm virt-v2v reached the "ready-to-serve" state:
+
+```bash
+kubectl -n <target-ns> get pod -l forklift.konveyor.io/plan -o wide
+kubectl -n <target-ns> logs <virt-v2v-pod> -c virt-v2v --tail=20
+```
+
+If the log ends with `Starting server on:8080` and no further activity, the conversion is done — the issue is network reachability, not the conversion itself.
+
+Verify the namespace's inbound policy set:
+
+```bash
+kubectl -n <target-ns> get networkpolicy
+kubectl -n <target-ns> describe networkpolicy
+```
+
+A "deny-all-by-default" rule combined with no explicit allow from the migration-controller namespace matches the failure mode. Apply the NetworkPolicy above; the plan resumes within one reconcile.

From 6c3bd4c56798928d158ddf6581488941eea8efca Mon Sep 17 00:00:00 2001
From: Komh <mail@guojing.io>
Date: Thu, 14 May 2026 17:29:49 +0800
Subject: [PATCH 2/2] [kb] Cold VM Migration Hangs Because NetworkPolicy in the
 Target Namespace Blocks the virt-v2v Pod

---
 ...arget_Namespace_Blocks_the_virt_v2v_Pod.md |  96 +++++++++++++++
 ..._Hangs_at_Conversion_Progress_Reporting.md | 112 ------------------
 2 files changed, 96 insertions(+), 112 deletions(-)
 create mode 100644 docs/en/solutions/Cold_VM_Migration_Hangs_Because_NetworkPolicy_in_the_Target_Namespace_Blocks_the_virt_v2v_Pod.md
 delete mode 100644 docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md

diff --git a/docs/en/solutions/Cold_VM_Migration_Hangs_Because_NetworkPolicy_in_the_Target_Namespace_Blocks_the_virt_v2v_Pod.md b/docs/en/solutions/Cold_VM_Migration_Hangs_Because_NetworkPolicy_in_the_Target_Namespace_Blocks_the_virt_v2v_Pod.md
new file mode 100644
index 000000000..88241ca8c
--- /dev/null
+++ b/docs/en/solutions/Cold_VM_Migration_Hangs_Because_NetworkPolicy_in_the_Target_Namespace_Blocks_the_virt_v2v_Pod.md
@@ -0,0 +1,96 @@
+---
+kind:
+   - Troubleshooting
+products:
+   - Alauda Container Platform
+ProductsVersion:
+   - 4.1.0,4.2.x
+---
+
+# Cold VM Migration Hangs Because NetworkPolicy in the Target Namespace Blocks the virt-v2v Pod
+## Issue
+
+A cold VM migration from VMware into ACP virtualization (KubeVirt),
+driven by the Alauda Build of Forklift Operator, hangs at the
+conversion-progress step. The `forklift-controller` log repeats
+`Failed to update conversion progress`, while the migration's
+`virt-v2v` pod in the target namespace logs that its server has
+started and then shows no further activity.
+
+## Root Cause
+
+At the conversion step, the per-VM `virt-v2v` pod is launched in the
+**target namespace** (the namespace where the imported VM will live).
+It serves the VM's XML on port `8080` and conversion-progress metrics
+on port `2112`. The `forklift-controller` runs in `konveyor-forklift`
+and polls the pod directly on its pod IP.
+
+If the target namespace has a `NetworkPolicy` that denies ingress by
+default (or restricts ingress to a label set that does not include
+the Forklift namespace), the controller's TCP connections to the
+`virt-v2v` pod time out. The migration pod is healthy in
+isolation; the controller cannot observe its progress, so it never
+advances past the conversion step.
+
+## Resolution
+
+Add a `NetworkPolicy` to the **target namespace** allowing ingress on
+ports `8080` and `2112` from the `konveyor-forklift` namespace:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: allow-forklift-controller
+  namespace: <target-namespace>
+spec:
+  podSelector: {}              # apply to virt-v2v pods (and any others)
+  policyTypes:
+    - Ingress
+  ingress:
+    - from:
+        - namespaceSelector:
+            matchLabels:
+              kubernetes.io/metadata.name: konveyor-forklift
+      ports:
+        - {protocol: TCP, port: 8080}   # VM XML
+        - {protocol: TCP, port: 2112}   # conversion-progress metrics
+```
+
+Apply, then re-run the migration plan. The controller's polling reaches
+`virt-v2v` and conversion progress advances normally.
+
+If existing policies in the target namespace use a different selector
+shape (custom namespace labels, pod-label allow-lists, etc.), add a
+single allow rule equivalent to the above. NetworkPolicy rules are
+additive, so the new rule leaves the existing rules untouched.
+
+## Diagnostic Steps
+
+Confirm where the controller and the conversion pod live:
+
+```bash
+kubectl -n konveyor-forklift get deploy forklift-controller
+kubectl -n <target-namespace> get pod -l job-name -o wide   # the virt-v2v pod is a Job-spawned pod
+```
+
+Confirm a NetworkPolicy is in fact blocking the path:
+
+```bash
+kubectl -n <target-namespace> get networkpolicy
+# Pick the policy and check its ingress.from — if no namespaceSelector
+# admits `konveyor-forklift`, that is the block.
+```
+
+Check the controller logs for `Failed to update conversion progress`
+or connection-refused / connection-timeout messages against the target
+pod IP:
+
+```bash
+kubectl -n konveyor-forklift logs deploy/forklift-controller | \
+  grep -E 'conversion progress|virt-v2v' | tail -20
+```
+
+A timeout when the controller dials the `virt-v2v` pod's IP confirms
+the policy block. A successful `:8080` response after the policy is in
+place confirms the fix.
diff --git a/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md b/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md
deleted file mode 100644
index cb33f6b12..000000000
--- a/docs/en/solutions/Cold_VM_Migration_from_VMware_Hangs_at_Conversion_Progress_Reporting.md
+++ /dev/null
@@ -1,112 +0,0 @@
----
-kind:
-   - Troubleshooting
-products:
-   - Alauda Container Platform
-ProductsVersion:
-   - 4.1.0,4.2.x
----
-## Issue
-
-A cold migration of a VMware VM into ACP Virtualization stalls at the final step. The migration plan sits at "converting…" forever. The `virt-v2v` pod in the target namespace completes its conversion work and opens an HTTP server to report the resulting VM XML, but the migration controller never completes the plan. Representative symptoms:
-
-- `virt-v2v` pod log ends with `Starting server on:8080`, no crash, no further progress.
-- Migration-controller log in the platform-migration namespace reports:
-
-  ```text
-  msg="Failed to update conversion progress"
-  error="Get \"http://10.128.1.34:2112/metrics\": dial tcp 10.128.1.34:2112: i/o timeout"
-  ```
-
-## Root Cause
-
-The VM-from-VMware workflow has two pods in two namespaces:
-
-- The **migration controller** (Forklift controller) runs in the platform migration namespace (install-specific; on ACP it's the Virtualization migration component's namespace).
-- The **virt-v2v + vddk** conversion pod runs in the **target** namespace — where the imported VM will live.
-
-At the end of conversion, virt-v2v exposes two HTTP endpoints to the controller:
-
-| Port | Purpose |
-|---|---|
-| `8080` | serves the produced VM XML (for the controller to read and create the VMI) |
-| `2112` | exposes conversion progress metrics |
-
-If the **target** namespace has a default-deny NetworkPolicy (or any policy that doesn't explicitly admit ingress from the migration-controller namespace), both of those endpoints are unreachable. The controller times out on `:2112/metrics`, the plan never reads the finished XML from `:8080`, and the migration hangs indefinitely.
-
-Common-case predicate: the target namespace carries a standard "deny-by-default + allow-from-same-namespace + allow-from-ingress + allow-from-monitoring" set of policies — which is a reasonable security baseline but doesn't know about the migration controller.
-
-## Resolution
-
-Admit ingress from the migration-controller namespace on the target namespace. Apply only while migrations are active, or keep it as a standing policy if the target namespace routinely receives VM imports.
-
-1. **Identify the migration-controller namespace.** It carries the virtualization migration controller pod. The exact name depends on the ACP Virtualization install:
-
-   ```bash
-   # Find the controller pod by label (common selector)
-   kubectl get pod -A -l app.kubernetes.io/component=forklift-controller -o wide
-   # Or by deployment name pattern
-   kubectl get pod -A | grep -E 'forklift.*controller|migration.*controller'
-   ```
-
-   Whatever namespace shows up is the one to admit.
-
-2. **Add the allow-ingress policy in the target namespace.** Pair the namespace selector with a pod selector so only the conversion pod is reachable:
-
-   ```yaml
-   apiVersion: networking.k8s.io/v1
-   kind: NetworkPolicy
-   metadata:
-     name: allow-from-migration-controller
-     namespace: <target-ns>
-   spec:
-     podSelector:
-       matchExpressions:
-         - key: forklift.konveyor.io/plan
-           operator: Exists
-     policyTypes:
-       - Ingress
-     ingress:
-       - from:
-           - namespaceSelector:
-               matchLabels:
-                 kubernetes.io/metadata.name: <migration-controller-ns>
-         ports:
-           - protocol: TCP
-             port: 8080
-           - protocol: TCP
-             port: 2112
-   ```
-
-   The `podSelector` scopes the rule to the virt-v2v pods only (they carry the `forklift.konveyor.io/plan` label). This keeps the rest of the namespace's default-deny posture intact.
-
-3. **Re-run the plan.** The controller picks up the next reconcile within ~10s and completes. No need to restart anything.
-
-4. **For a standing solution**, include the allow-from-migration-controller policy in the namespace template / project template used for VM-import landing zones. Once a team knows their namespace will receive migrations, the policy should be pre-provisioned.
-
-## Diagnostic Steps
-
-Confirm the controller is timing out on the metrics endpoint:
-
-```bash
-kubectl -n <migration-controller-ns> logs deploy/<controller> --tail=200 \
-  | grep -i 'Failed to update conversion progress'
-```
-
-Confirm virt-v2v reached the "ready-to-serve" state:
-
-```bash
-kubectl -n <target-ns> get pod -l forklift.konveyor.io/plan -o wide
-kubectl -n <target-ns> logs <virt-v2v-pod> -c virt-v2v --tail=20
-```
-
-If the log ends with `Starting server on:8080` and no further activity, the conversion is done — the issue is network reachability, not the conversion itself.
-
-Verify the namespace's inbound policy set:
-
-```bash
-kubectl -n <target-ns> get networkpolicy
-kubectl -n <target-ns> describe networkpolicy
-```
-
-A "deny-all-by-default" rule combined with no explicit allow from the migration-controller namespace matches the failure mode. Apply the NetworkPolicy above; the plan resumes within one reconcile.