Add drift detection and periodic resync for OpenStack resources #759
Open
eshulman2 wants to merge 7 commits into
eshulman2
commented
Apr 20, 2026
Collaborator
Adding the |
Add resyncPeriod to the code generator spec template allowing per-resource configuration of how frequently the controller re-reconciles even when no changes are detected. Add lastSyncTime to the status template to track when the last successful resync occurred. Update the adapter template and interface to expose GetResyncPeriod(), GetLastSyncTime(), and IsImported() methods.
Run make generate to update all generated code after API field and template changes.
Add --default-resync-period CLI flag to the manager for setting a global default resync period. Implement DetermineResyncPeriod to resolve the effective period using per-resource then global default precedence. Create resync scheduler package using wait.Jitter for jittered duration calculation. Modify shouldReconcile to support periodic resync by checking lastSyncTime against the effective resync period. Integrate resync requeue scheduling into reconcileNormal and update the status writer to set lastSyncTime on successful reconciliation. Wire the DefaultResyncPeriod from manager options through to all controllers. Add typed ExternallyDeleted signal on ReconcileStatus for safe detection of externally deleted OpenStack resources.
Add E2E tests covering resync period, resync jitter, resync disabled, and terminal error behavior during resync. Add integration tests for handling unmanaged resources during resync to verify that imported resources without resyncPeriod skip requeue scheduling.
Modify GetOrCreateOSResource to detect when a managed, non-imported resource has been deleted externally from OpenStack. Signal the caller via the typed ExternallyDeleted ReconcileStatus so it can clear status.id and trigger recreation. Restore the safety guard for unexpected nil returns from actuators. Fix Available condition to correctly transition from Unknown to False when a terminal error is present during resync. Include unit tests for external deletion handling and E2E tests covering both managed resource recreation and imported resource terminal error behavior.
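The external deletion flow described in this commit could look roughly like the sketch below. All names here (`reconcileStatus`, `getByStatusID`, `errNotFound`, `server`) are hypothetical stand-ins, not ORC's actual types; the sketch only illustrates the three outcomes the commit message describes: a typed signal for managed resources, a terminal error for imported ones, and a guard against an unexpected nil result.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for an OpenStack 404 response (hypothetical).
var errNotFound = errors.New("resource not found")

// reconcileStatus is a simplified stand-in for the typed status.
type reconcileStatus struct {
	externallyDeleted bool  // managed resource vanished: clear the ID, recreate
	terminalErr       error // not retried, surfaced on the Available condition
	err               error // transient or unexpected error
}

type server struct{ ID string }

// getByStatusID fetches the resource recorded in status.id and classifies
// the outcome instead of returning an ambiguous (nil, nil).
func getByStatusID(fetch func(id string) (*server, error), id string, imported bool) (*server, reconcileStatus) {
	res, err := fetch(id)
	switch {
	case errors.Is(err, errNotFound) && !imported:
		return nil, reconcileStatus{externallyDeleted: true}
	case errors.Is(err, errNotFound):
		return nil, reconcileStatus{terminalErr: fmt.Errorf("imported resource %s no longer exists", id)}
	case err != nil:
		return nil, reconcileStatus{err: err}
	case res == nil:
		// Safety guard: nil without an error is treated as a bug, not a deletion.
		return nil, reconcileStatus{err: errors.New("fetch returned nil resource without error")}
	}
	return res, reconcileStatus{}
}

func main() {
	gone := func(string) (*server, error) { return nil, errNotFound }

	statusID := "abc-123"
	_, st := getByStatusID(gone, statusID, false)
	if st.externallyDeleted {
		statusID = "" // clear status.id so the next reconcile recreates the resource
	}
	fmt.Printf("managed: externallyDeleted=%v statusID=%q\n", st.externallyDeleted, statusID)

	_, st = getByStatusID(gone, "abc-123", true)
	fmt.Println("imported:", st.terminalErr)
}
```

Because deletion is an explicit flag, a fetch that wrongly returns nil without an error falls into the guard branch rather than triggering recreation.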
Add user-facing documentation covering periodic resync configuration, drift detection behavior, and external deletion handling. Update the enhancement proposal to reflect the final two-tier resync period resolution design.
Replace the (nil, nil) sentinel pattern with a typed ExternallyDeleted signal on ReconcileStatus. This is a design improvement over the original proposal: using a dedicated signal prevents ambiguity with actuator bugs that might also return nil, and restores the safety guard for unexpected nil returns from GetOSResourceByID.
Contributor
Author
@mandre I revised the PR myself, removing any indication of internal references. I think it is ready for review.
Summary
Implements periodic resync and external deletion handling for the OpenStack Resource Controller. When enabled, ORC periodically reconciles resources with OpenStack to detect configuration drift and external deletions, automatically recreating managed resources that were deleted outside of ORC while surfacing terminal errors for imported resources.
Changes
API Changes
- Add resyncPeriod field to the code generator spec template for per-resource resync configuration
- Add lastSyncTime field to the code generator status template to track successful reconciliation timestamps
- Add GetResyncPeriod(), GetLastSyncTime(), and IsImported() methods to the APIObjectAdapter interface

CLI & Manager
- Add --default-resync-period CLI flag with a default value of 0 (disabled)
- Add ResyncConfigurable interface to propagate the global default to all controllers

Core Resync Logic
- Add internal/controllers/generic/resync/ package with:
  - DetermineResyncPeriod(): resolves the effective period (per-resource → global → disabled)
  - CalculateJitteredDuration(): wraps wait.Jitter with 20% positive-only jitter
  - ShouldScheduleResync(): guards against scheduling for disabled, terminal, and pending-requeue states
- Modify shouldReconcile() to trigger reconciliation when the resync period has elapsed
- Update reconcileNormal() to schedule a jittered requeue after successful reconciliation

External Deletion Handling
- Modify GetOrCreateOSResource() to detect 404s when fetching by status.id
- Add ExternallyDeleted signal via ReconcileStatus for managed non-imported resources
- Add ClearStatusID() to the status writer for clearing the ID before recreation

Status Updates
- Update UpdateStatus() to set lastSyncTime only on successful reconciliation
- Fix Available condition: terminal errors now set False instead of Unknown

Code Generation
- Update api.template and adapter.template with the new fields and methods

Documentation
- Add website/docs/user-guide/drift-detection.md
- Update enhancements/drift-detection.md with implementation status

Commit structure
Design decisions
- Typed ExternallyDeleted signal: The original proposal used a (nil, nil) return from GetOrCreateOSResource to signal external deletion. This was replaced with a typed ExternallyDeleted flag on ReconcileStatus, which avoids ambiguity with actuator bugs that might also return nil and restores a safety guard for unexpected nil returns from GetOSResourceByID.
- wait.Jitter reuse: CalculateJitteredDuration wraps k8s.io/apimachinery/pkg/util/wait.Jitter rather than reimplementing the same [base, base*1.2) jitter range.
- CommonStatus/CommonOptions types: These types were declared but never embedded; the code generator inlines the fields directly. They were removed to avoid misleading readers.

Testing
- go build, go vet, and make lint all pass