Skip to content

Conversation

@shivansh-gohem
Copy link
Contributor

This PR implements the lifecycle management logic described in KREP-004 to resolve #789.

Note: This PR supersedes #959. I apologize for the noise on the previous PR; I encountered git history corruption while rebasing onto the new v0.8.0 changes and decided a clean branch was the safest path to ensure no legacy commits were lost.

Changes

  • OwnerReferences: Injected SetOwnerReferences when creating CRDs. This ensures Kubernetes Garbage Collection automatically cleans up CRDs (and their instances) when an RGD is deleted.
  • Finalizers: Added kro.run/finalizer to RGDs. This blocks deletion until the controller has successfully run its cleanup logic.
  • v0.8.0 Compatibility: Refactored the controller logic to align with the new pkg/graph architecture (migrated from removed methods like GetCRD()/GetDependencies() to the new .Meta field structure).

Implementation Details

  • Added handleFinalizer helper in controller_reconcile.go to manage the finalizer lock.
  • Updated Reconcile loop in controller.go to handle the deletion timestamp and ensure cleanup runs before removing the finalizer.
  • Updated crd generation to use runtime.DefaultUnstructuredConverter since GetCRD() was removed in the recent refactor.

Verification

  • make lint passed.
  • make test passed.
  • Verified manually that CRDs now have OwnerReferences pointing to their RGD.
  • Verified that RGDs now have the kro.run/finalizer attached.

Closes #789

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 29, 2026
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shivansh-gohem
Once this PR has been reviewed and has the lgtm label, please assign cheftako for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 29, 2026
@k8s-ci-robot
Copy link
Contributor

Hi @shivansh-gohem. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 29, 2026
mark.KindReady(crd.Status.AcceptedNames.Kind)
}

// TODO: the context that is passed here is tied to the reconciliation of the rgd, we might need to make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not have been removed

@k8s-ci-robot k8s-ci-robot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jan 29, 2026
@shivansh-gohem shivansh-gohem force-pushed the feat/lifecycle-management-final branch from 905914d to 8566f94 Compare January 29, 2026 18:45
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jan 29, 2026
@shivansh-gohem shivansh-gohem force-pushed the feat/lifecycle-management-final branch from 8566f94 to 447371a Compare January 29, 2026 18:54
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 29, 2026
- Remove unclear FIX comments per review feedback
- Restore TODO comment about context decoupling
- Rename finalizer to descriptive kro.run/cleanup
- Implement instance cleanup grace period for issue 789
- Wait for all instances to be deleted before finalizer removal
BREAKING CHANGE: Finalizer renamed to kro.run/cleanup
@shivansh-gohem shivansh-gohem force-pushed the feat/lifecycle-management-final branch from 447371a to ad5fcd6 Compare January 29, 2026 18:58
@shivansh-gohem
Copy link
Contributor Author

Hi @jakobmoellerdev ,

Thank you for the detailed review feedback! I've addressed all your comments and cleaned up the code!! I want to be transparent - I use AI assistance primarily for text formatting, documentation, and understanding patterns in the codebase. Will not repeat further mistake in future.

Copy link
Member

@jakobmoellerdev jakobmoellerdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this feature is still controversial in the community, it is okay from an implementation perspective.

The issue still lies in the fact that deletion of CRDs is considered by many to be too dangerous to happen during cleanup. If you want to bring your point and get it merged, we will need to agree here with at least 2 other maintainers to back this.

I will be transparent here and say that it it will be hard to push it through without considering any of the tradeoffs in the ADR.

/cc @a-hilaly

@k8s-ci-robot k8s-ci-robot requested a review from a-hilaly January 29, 2026 19:08
@a-hilaly
Copy link
Member

i had now idea this was being worked on.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 29, 2026
@shivansh-gohem
Copy link
Contributor Author

Hi @jakobmoellerdev and @a-hilaly ,

I think there might be some confusion about what this PR does. Let me clarify:

What This PR Does ✅
This PR implements instance cleanup grace period (issue #789) by:

(i)Adding a finalizer (kro.run/cleanup) to RGDs
(ii)Waiting for all instances to be deleted before removing the finalizer
(iii)Requeuing every 5 seconds while instances still exist
What This PR Does NOT Do ❌
(i)This PR does NOT change CRD deletion behavior
(ii)CRD deletion still respects the existing allowCRDDeletion flag
(iii)No changes to the controversial CRD deletion logic

The Problem This Solves
Currently, when an RGD is deleted:

(i)The microcontroller shuts down immediately
(ii)Instances become orphaned (no reconciliation)
(iii)Users requested a grace period to allow instance cleanup

After This PR
(i)RGD deletion waits for instances to be cleaned up first
(ii)Reconciliation continues during the grace period
(iii)Only after instances are gone does cleanup proceed

This is purely about instance lifecycle management, not CRD deletion policy. The implementation follows standard Kubernetes finalizer patterns.

Happy to discuss the approach or make any changes needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support GVK reconciliation for a certain time after RGD is deleted

4 participants