camilamacedo86 commented Jan 11, 2026

When upgrading OLM from standard (Helm runtime) to experimental (Boxcutter runtime), the BoxcutterStorageMigrator creates a ClusterExtensionRevision from the existing Helm release. However, the migrated revision was created without status conditions, causing a race condition where it wasn't recognized as "Installed".

This fix sets an initial Succeeded=True status on migrated revisions, ensuring they're immediately recognized and allowing version upgrades to proceed correctly after OLM upgrades.

Real-World Scenario

What We're Doing

Day 1: You install OLM standard edition and install PostgreSQL operator v2.0.0

  • Everything works great ✅
  • Your databases are running ✅

Day 2: You want to try the new Boxcutter runtime (experimental features)

  • You upgrade OLM from standard to experimental
  • OLM upgrade completes successfully ✅
  • PostgreSQL still runs fine ✅

Day 3: PostgreSQL v2.1.0 is released with bug fixes you need

  • You try to upgrade PostgreSQL from v2.0.0 → v2.1.0
  • It FAILS
  • Error: "Cannot determine installed version"
  • You're stuck on old version with bugs ❌

What Was Happening (Before Fix)

The Migration Process

When OLM upgrades from Helm to Boxcutter:

  1. Migration starts - OLM needs to convert Helm storage to Boxcutter storage
  2. Creates ClusterExtensionRevision - Copies all your installed manifests
  3. BUT - Forgets to mark it as "successfully installed"
  4. Race condition - System checks what's installed before status is set
  5. System thinks - "Nothing is installed yet, this is still rolling out"
  6. Result - Can't compute upgrade path without knowing current version
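
The failure mode in the steps above can be sketched in Go. This is a minimal illustration, not the operator-controller code: the Revision and Condition types are simplified stand-ins for ClusterExtensionRevision and its status conditions, and the rule that the installed version comes from the highest-numbered revision with Succeeded=True is an assumption made for the example.

```go
package main

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
	Reason string
}

// Revision is a trimmed-down, hypothetical ClusterExtensionRevision.
type Revision struct {
	Number     int
	Version    string
	Conditions []Condition
}

// isSucceeded reports whether the revision carries Succeeded=True.
func isSucceeded(r Revision) bool {
	for _, c := range r.Conditions {
		if c.Type == "Succeeded" && c.Status == "True" {
			return true
		}
	}
	return false
}

// installedVersion returns the version of the highest-numbered revision that
// has Succeeded=True. Revisions without that condition are treated as still
// rolling out, so they never count as installed.
func installedVersion(revs []Revision) (string, bool) {
	best := -1
	for i, r := range revs {
		if isSucceeded(r) && (best < 0 || r.Number > revs[best].Number) {
			best = i
		}
	}
	if best < 0 {
		return "", false
	}
	return revs[best].Version, true
}
```

Under these assumptions, a migrated revision with an empty condition list makes installedVersion report that nothing is installed, which is the state described in step 5.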

What's Fixed Now

After the Fix

When OLM upgrades from Helm to Boxcutter:

  1. Migration starts - Same as before
  2. Creates ClusterExtensionRevision - Same as before
  3. ✨ NEW: Immediately marks it as succeeded - No waiting!
  4. System checks - "What's installed?"
  5. System knows - "v2.0.0 is installed and working"
  6. Result - Upgrade to v2.1.0 works perfectly!

Copilot AI review requested due to automatic review settings January 11, 2026 18:35

netlify bot commented Jan 11, 2026

Deploy Preview for olmv1 ready!

🔨 Latest commit: 353a41e
🔍 Latest deploy log: https://app.netlify.com/projects/olmv1/deploys/697cecaf39fce6000839fc15
😎 Deploy Preview: https://deploy-preview-2440--olmv1.netlify.app

camilamacedo86 changed the title from "🐛 (fix) Helm to Boxcutter migration during OLM upgrade" to "WIP 🐛 (fix) Helm to Boxcutter migration during OLM upgrade" Jan 11, 2026
openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Jan 11, 2026

Copilot AI left a comment

Pull request overview

This pull request fixes a race condition that occurs during the upgrade from standard OLM (Helm runtime) to experimental OLM (Boxcutter runtime). The issue arose because migrated ClusterExtensionRevisions were created without a Succeeded=True status condition, causing them not to be recognized as "Installed" until the ClusterExtensionRevision controller reconciled them. This timing gap led to version resolution failures during OLM upgrades.

Changes:

  • Added a new ClusterExtensionRevisionReasonMigrated constant for tracking migration status
  • Set initial Succeeded=True status condition on migrated revisions immediately after creation
  • Enhanced documentation explaining the race condition and its resolution

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File: Description
  • api/v1/clusterextensionrevision_types.go: Added new ClusterExtensionRevisionReasonMigrated constant for status condition reasons
  • internal/operator-controller/applier/boxcutter.go: Added status update logic to set Succeeded=True condition on migrated revisions with comprehensive documentation

camilamacedo86 changed the title from "WIP 🐛 (fix) Helm to Boxcutter migration during OLM upgrade" to "🐛 (fix) Helm to Boxcutter migration during OLM upgrade" Jan 11, 2026
openshift-ci bot removed the do-not-merge/work-in-progress label Jan 11, 2026

codecov bot commented Jan 11, 2026

Codecov Report

❌ Patch coverage is 64.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.48%. Comparing base (4d4f894) to head (353a41e).
⚠️ Report is 4 commits behind head on main.

Files with missing lines:
  • internal/operator-controller/applier/boxcutter.go: patch coverage 64.00%, 10 missing and 8 partial lines ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2440      +/-   ##
==========================================
- Coverage   69.48%   69.48%   -0.01%     
==========================================
  Files         102      102              
  Lines        8249     8297      +48     
==========================================
+ Hits         5732     5765      +33     
- Misses       2063     2071       +8     
- Partials      454      461       +7     
Flag coverage (Δ):
  • e2e: 46.33% <0.00%> (-0.54%) ⬇️
  • experimental-e2e: 13.30% <0.00%> (-0.09%) ⬇️
  • unit: 57.56% <64.00%> (+0.04%) ⬆️

camilamacedo86 requested review from Copilot and removed request for ankitathomas, Copilot, joelanford, pedjak and perdasilva January 12, 2026 00:23
camilamacedo86 changed the title from "🐛 (fix) Helm to Boxcutter migration during OLM upgrade" to "WIP 🐛 (fix) Helm to Boxcutter migration during OLM upgrade" Jan 12, 2026
openshift-ci bot added the do-not-merge/work-in-progress label Jan 12, 2026
camilamacedo86 changed the title from "WIP 🐛 (fix) Helm to Boxcutter migration during OLM upgrade" to "🐛 (fix) Helm to Boxcutter migration during OLM upgrade" Jan 29, 2026
openshift-ci bot removed the do-not-merge/work-in-progress label Jan 29, 2026
camilamacedo86 changed the title from "🐛 (fix) Helm to Boxcutter migration during OLM upgrade" to "🐛 (fix) Fix race condition in Helm to Boxcutter migration during OLM upgrades" Jan 29, 2026
camilamacedo86 force-pushed the fix-only-migration branch 2 times, most recently from 932f0af to 7ed8d21 January 30, 2026 16:33
Copilot AI review requested due to automatic review settings January 30, 2026 16:33

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


When upgrading OLM from standard (Helm runtime) to experimental
(Boxcutter runtime), the BoxcutterStorageMigrator creates a
ClusterExtensionRevision from the existing Helm release. However,
the migrated revision was created without status conditions, causing
a race condition where it wasn't recognized as "Installed".

Assisted-by: CLAUDE
Copilot AI review requested due to automatic review settings January 30, 2026 16:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings January 30, 2026 17:06

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings January 30, 2026 17:20

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


Comment on lines 1509 to 1513
	client.
		On("List", mock.Anything, mock.AnythingOfType("*v1.ClusterExtensionRevisionList"), mock.Anything).
		Run(func(args mock.Arguments) {
			list := args.Get(1).(*ocv1.ClusterExtensionRevisionList)
			list.Items = []ocv1.ClusterExtensionRevision{existingRev}

Copilot AI Jan 30, 2026

This test sets up List() but does not mock client.Get(). BoxcutterStorageMigrator.Migrate() currently calls Client.Get() when revisions exist and revision 1 is not already Succeeded=True, so this will trigger an unexpected mock call and fail the test. Either add a Get() expectation here or change ensureMigratedRevisionStatus to skip non-migrated revision 1 before calling ensureRevisionStatus (so no Get() occurs).
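
The second option in the comment above (skip before any Get is issued) could be sketched as follows. Everything here is hypothetical and simplified: the Migrated field is an invented marker for "created by the storage migrator", the getter interface and countingGetter fake replace the real client and testify mock, and ensureMigratedStatus only borrows its name from the comment's description.

```go
package main

import "errors"

// Condition and Revision are simplified, hypothetical stand-ins for the API types.
type Condition struct{ Type, Status string }

type Revision struct {
	Name       string
	Number     int
	Migrated   bool // hypothetical marker: revision was created by the migrator
	Conditions []Condition
}

// getter is the only client capability this sketch needs.
type getter interface {
	Get(name string) (*Revision, error)
}

// hasSucceeded reports whether the revision carries Succeeded=True.
func hasSucceeded(r Revision) bool {
	for _, c := range r.Conditions {
		if c.Type == "Succeeded" && c.Status == "True" {
			return true
		}
	}
	return false
}

// ensureMigratedStatus fetches and updates only revisions that are revision 1,
// flagged as migrated, and not yet Succeeded=True. Every other revision is
// skipped before a Get call is made.
func ensureMigratedStatus(c getter, listed []Revision) error {
	for _, r := range listed {
		if r.Number != 1 || !r.Migrated || hasSucceeded(r) {
			continue
		}
		fresh, err := c.Get(r.Name)
		if err != nil {
			return err
		}
		// Re-check on the fresh copy in case the listed data was stale.
		if !hasSucceeded(*fresh) {
			fresh.Conditions = append(fresh.Conditions, Condition{Type: "Succeeded", Status: "True"})
		}
	}
	return nil
}

// countingGetter is a tiny fake client that records how often Get is called.
type countingGetter struct {
	calls int
	store map[string]*Revision
}

func (g *countingGetter) Get(name string) (*Revision, error) {
	g.calls++
	if r, ok := g.store[name]; ok {
		return r, nil
	}
	return nil, errors.New("revision not found: " + name)
}
```

With the guard placed ahead of the Get call, a test that only sets up List for a non-migrated revision 1 would not trigger an unexpected Get in this sketch. The trade-off is that the migrator then depends on a reliable way to tell migrated revisions apart from ordinary ones.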

pedjak left a comment

/lgtm

openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Jan 30, 2026

tmshort commented Jan 30, 2026

/approve

openshift-ci bot commented Jan 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tmshort

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Jan 30, 2026
camilamacedo86 commented

/override codecov/patch

openshift-ci bot commented Jan 31, 2026

@camilamacedo86: Overrode contexts on behalf of camilamacedo86: codecov/patch

Details

In response to this:

/override codecov/patch

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-merge-bot bot merged commit 492c7e5 into operator-framework:main Jan 31, 2026
29 of 30 checks passed
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • lgtm: Indicates that a PR is ready to be merged.
