
Conversation

@matheuscscp (Member) commented on Jan 14, 2026

Depends on: fluxcd/pkg#1069

Closes: #1300

@matheuscscp added the dependencies label (Pull requests that update a dependency) on Jan 14, 2026
@matheuscscp force-pushed the helm4 branch 2 times, most recently from c5d3af6 to d8f6c12 on January 15, 2026 00:37
@matheuscscp force-pushed the helm4 branch 9 times, most recently from 3f4094f to b918924 on January 18, 2026 11:39
matheuscscp and others added 6 commits January 22, 2026 16:21
Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>
Add ServerSideApply field to Install, Upgrade, and Rollback specs,
allowing users to control server-side apply behavior per action.

- Install.ServerSideApply: *bool (default based on UseHelm3Defaults)
- Upgrade.ServerSideApply: *string ("true", "false", "auto")
- Rollback.ServerSideApply: *string ("true", "false", "auto")

User-specified values take precedence over defaults. When not
specified, the existing default behavior is preserved.

Signed-off-by: cappyzawa <cappyzawa@gmail.com>
Add end-to-end tests to verify the ServerSideApply field works correctly
for install, upgrade, and rollback operations.

The tests verify that when serverSideApply is set, the Helm release
uses the SSA apply method (apply_method: ssa in the release secret).

Signed-off-by: cappyzawa <cappyzawa@gmail.com>
Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>
This adds the `.status.inventory` field to HelmRelease, similar to
Kustomization, to expose managed Kubernetes objects.

The inventory includes:
- Objects from the release manifest (with missing namespaces filled in)
- CRDs from the chart's crds/ directory

Helm hooks are excluded as they are ephemeral resources deleted
after execution.

Signed-off-by: cappyzawa <cappyzawa@gmail.com>
Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>
Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>
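
The inventory commit above exposes the objects managed by a release via `.status.inventory`, similar to Kustomization. A rough sketch of what such an inventory type looks like, assuming it mirrors the Kustomization `ResourceInventory` (the JSON tags and the ID encoding in the comments follow the Kustomization convention and are assumptions, not copied from this PR):

```go
// Sketch of an inventory type mirroring the Kustomization API; assuming the
// HelmRelease inventory uses the same shape.
type ResourceInventory struct {
	// Entries of Kubernetes objects created by the Helm release.
	Entries []ResourceRef `json:"entries"`
}

type ResourceRef struct {
	// ID is the string representation of the object,
	// in the format '<namespace>_<name>_<group>_<kind>'.
	ID string `json:"id"`
	// Version is the API version of the object.
	Version string `json:"v"`
}
```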
@matheuscscp force-pushed the helm4 branch 6 times, most recently from b0382d2 to ac5b312 on January 22, 2026 20:41
@matheuscscp force-pushed the helm4 branch 4 times, most recently from 203b6bb to 4f120a1 on January 22, 2026 21:45
@matheuscscp (Member, Author) commented on Jan 22, 2026

Server-Side Apply (SSA) vs Client-Side Apply (CSA) E2E Tests

This PR adds comprehensive e2e tests to verify the SSA/CSA behavior of HelmRelease installs and upgrades. The tests validate that the .spec.install.serverSideApply and .spec.upgrade.serverSideApply fields work correctly by inspecting the Kubernetes managed fields on deployed resources.
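
For reference, a minimal sketch of the fields exercised by these tests, based on the commit description above (the struct layout, markers, and doc comments are assumptions, not the actual api/v2 source):

```go
// Sketch only: field names and semantics come from the commit description;
// everything else here is an assumption.
type Install struct {
	// ServerSideApply enables server-side apply for the install action.
	// When nil, the default is derived from UseHelm3Defaults.
	// +optional
	ServerSideApply *bool `json:"serverSideApply,omitempty"`
}

type Upgrade struct {
	// ServerSideApply controls the apply method for the upgrade action:
	// "true", "false", or "auto" (the default), where "auto" inherits the
	// apply method used at install time.
	// +optional
	ServerSideApply *string `json:"serverSideApply,omitempty"`
}

type Rollback struct {
	// ServerSideApply controls the apply method for the rollback action:
	// "true", "false", or "auto".
	// +optional
	ServerSideApply *string `json:"serverSideApply,omitempty"`
}
```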

Test Overview

| Test | Install | Upgrade | Purpose |
| --- | --- | --- | --- |
| client-side apply upgrade test | CSA (false) | CSA (auto/default) | Verify CSA is used throughout when SSA is disabled |
| SSA install with CSA upgrade test | SSA (true) | CSA (disabled) | Verify switching from SSA to CSA on upgrade |
| CSA install with SSA upgrade test | CSA (false) | SSA (enabled) | Verify switching from CSA to SSA on upgrade |
| SSA to CSA field removal test | SSA (true) | CSA (disabled) | Verify field removal behavior when switching from SSA to CSA |

How We Verify SSA vs CSA

The tests check the managedFields metadata on the Deployment created by the Helm chart:

  • SSA (Server-Side Apply): Shows helm-controller: Apply operation
  • CSA (Client-Side Apply): Shows helm-controller: Update operation

Upgrades are triggered by patching .spec.values.podAnnotations.upgrade-timestamp, which changes the Pod template and forces a real Helm upgrade.
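
The checks in the logs below are done with kubectl; the same verification can be expressed with client-go, roughly as follows (the namespace "default" and Deployment name "podinfo" are placeholders, not taken from the test chart):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// "default"/"podinfo" are placeholders for the namespace and Deployment
	// created by the test chart.
	deploy, err := cs.AppsV1().Deployments("default").Get(context.Background(), "podinfo", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Count the helm-controller managedFields entries per operation:
	// Apply means SSA, Update means CSA.
	var applies, updates int
	for _, mf := range deploy.GetManagedFields() {
		if mf.Manager != "helm-controller" {
			continue
		}
		switch mf.Operation {
		case metav1.ManagedFieldsOperationApply:
			applies++
		case metav1.ManagedFieldsOperationUpdate:
			updates++
		}
	}
	fmt.Printf("helm-controller operations: Apply=%d, Update=%d\n", applies, updates)
}
```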


Test 1: CSA Install with CSA Upgrade ("auto", the default for .spec.upgrade.serverSideApply)

When .spec.install.serverSideApply: false is set and no explicit upgrade setting is specified, the upgrade inherits the install behavior:

helmrelease.helm.toolkit.fluxcd.io/no-server-side-apply created
helmrelease.helm.toolkit.fluxcd.io/no-server-side-apply condition met
>>> Checking managed fields after install
Found 2 managed field entries
Managers before upgrade:
  - helm-controller: Update
  - kube-controller-manager: Update
helm-controller operations: Update=1, Apply=0
PASS: Install used CSA (Update operation)
>>> Triggering upgrade via patch (expecting CSA)
helmrelease.helm.toolkit.fluxcd.io/no-server-side-apply patched
>>> Waiting for upgrade done
helmrelease.helm.toolkit.fluxcd.io/no-server-side-apply condition met
Pod annotation upgrade-timestamp: 2026-01-22T21:50:25Z
>>> Checking managed fields after upgrade
Managers after upgrade:
  - helm-controller: Update
  - kube-controller-manager: Update
helm-controller operations: Update=1, Apply=0
PASS: Upgrade used CSA (Update operation)
helmrelease.helm.toolkit.fluxcd.io "no-server-side-apply" deleted

Test 2: SSA Install → CSA Upgrade

When .spec.install.serverSideApply: true and .spec.upgrade.serverSideApply: disabled:

helmrelease.helm.toolkit.fluxcd.io/ssa-install-no-ssa-upgrade created
helmrelease.helm.toolkit.fluxcd.io/ssa-install-no-ssa-upgrade condition met
>>> Checking managed fields after install (expecting SSA)
Found 2 managed field entries
Managers after install:
  - helm-controller: Apply
  - kube-controller-manager: Update
Fields managed by helm-controller:
Apply: 57 fields
helm-controller operations: Apply=1, Update=0
PASS: Install used SSA (Apply operation)
>>> Triggering upgrade via patch (expecting CSA)
helmrelease.helm.toolkit.fluxcd.io/ssa-install-no-ssa-upgrade patched
>>> Waiting for upgrade. done
helmrelease.helm.toolkit.fluxcd.io/ssa-install-no-ssa-upgrade condition met
Pod annotation upgrade-timestamp: 2026-01-22T21:50:34Z
>>> Checking managed fields after upgrade (expecting CSA)
Managers after upgrade:
  - helm-controller: Apply
  - helm-controller: Update
  - kube-controller-manager: Update
Fields managed by helm-controller:
Apply: 57 fields
Update: f:spec.f:template.f:metadata.f:annotations.f:upgrade-timestamp
helm-controller operations: Apply=1, Update=1
PASS: Upgrade used CSA (Update operation)
helmrelease.helm.toolkit.fluxcd.io "ssa-install-no-ssa-upgrade" deleted

Key observation: After the CSA upgrade, we see both Apply and Update entries from helm-controller. The original SSA install still owns 57 fields via Apply, while the CSA upgrade added a new Update entry for just the changed field (f:upgrade-timestamp).


Test 3: CSA Install → SSA Upgrade

When .spec.install.serverSideApply: false and .spec.upgrade.serverSideApply: enabled:

helmrelease.helm.toolkit.fluxcd.io/no-ssa-install-ssa-upgrade created
helmrelease.helm.toolkit.fluxcd.io/no-ssa-install-ssa-upgrade condition met
>>> Checking managed fields after install (expecting CSA)
Found 2 managed field entries
Managers after install:
  - helm-controller: Update
  - kube-controller-manager: Update
Fields managed by helm-controller:
Update: 80 fields
helm-controller operations: Apply=0, Update=1
PASS: Install used CSA (Update operation)
>>> Triggering upgrade via patch (expecting SSA)
helmrelease.helm.toolkit.fluxcd.io/no-ssa-install-ssa-upgrade patched
>>> Waiting for upgrade. done
helmrelease.helm.toolkit.fluxcd.io/no-ssa-install-ssa-upgrade condition met
Pod annotation upgrade-timestamp: 2026-01-22T21:50:46Z
>>> Checking managed fields after upgrade (expecting SSA)
Managers after upgrade:
  - helm-controller: Apply
  - kube-controller-manager: Update
Fields managed by helm-controller:
Apply: 58 fields
helm-controller operations: Apply=1, Update=0
PASS: Upgrade used SSA (Apply operation)
helmrelease.helm.toolkit.fluxcd.io "no-ssa-install-ssa-upgrade" deleted

Key observation: When switching from CSA to SSA, the SSA upgrade takes over ownership entirely. The previous CSA Update entry (80 fields) disappears and is replaced by the SSA Apply entry (58 fields).


Test 4: SSA Install → CSA Upgrade with Field Removal

This test verifies what happens when you remove a field from .spec.values while switching from SSA to CSA. The field WILL be removed from the actual object because Helm tracks field ownership and cleans up removed fields.

helmrelease.helm.toolkit.fluxcd.io/ssa-to-csa-field-removal created
helmrelease.helm.toolkit.fluxcd.io/ssa-to-csa-field-removal condition met
>>> Checking managed fields after install (expecting SSA)
Found 2 managed field entries
Managers after install:
  - helm-controller: Apply
  - kube-controller-manager: Update
Fields managed by helm-controller:
Apply: 58 fields
helm-controller operations: Apply=1, Update=0
PASS: Install used SSA (Apply operation)
>>> Verifying SSA-owned annotation after install
Pod annotation ssa-owned-field: this-should-persist-after-csa-upgrade
PASS: SSA-owned annotation present after install
>>> Triggering upgrade by removing podAnnotations from values
helmrelease.helm.toolkit.fluxcd.io/ssa-to-csa-field-removal patched
>>> Waiting for upgrade. done
helmrelease.helm.toolkit.fluxcd.io/ssa-to-csa-field-removal condition met
>>> Checking managed fields after upgrade
Managers after upgrade:
  - helm-controller: Apply
  - kube-controller-manager: Update
Fields managed by helm-controller:
Apply: 57 fields
helm-controller operations: Apply=1, Update=0
>>> Verifying field count decreased
Field count before: 58
Field count after: 57
PASS: Field count decreased from 58 to 57
>>> Verifying SSA-owned field was removed after upgrade
Pod annotation ssa-owned-field after upgrade: ''
PASS: SSA-owned field was removed after upgrade

This confirms that removing a field from HelmRelease values WILL
remove it from the actual object, even when switching from SSA to CSA.
Helm properly tracks field ownership and cleans up removed fields.
helmrelease.helm.toolkit.fluxcd.io "ssa-to-csa-field-removal" deleted

Key observation: When you remove a field from your HelmRelease values, Helm removes it from the actual Kubernetes object regardless of whether you're using SSA or CSA. The field count dropped from 58 to 57, and the ssa-owned-field annotation was removed. This is the expected and desired behavior: Helm properly garbage collects fields that are no longer in the rendered manifests.


Understanding the Field Counts

The difference in field counts between SSA and CSA is expected:

  • SSA (Apply): 57-58 fields - Server-side apply is precise; it only tracks the specific fields the client explicitly claims ownership of. The controller sends a partial object.

  • CSA (Update): 80 fields - Client-side apply does a read-modify-write cycle with the full object, so it tends to claim ownership of more fields.

The 57 vs 58 difference is the podAnnotations.upgrade-timestamp field we added during the upgrade.

The interesting behavior in Test 3 shows SSA's field ownership model at work: when you switch from CSA (80 fields) to SSA (58 fields), SSA "takes over" with its more precise field set and the previous CSA Update entry disappears entirely.

@matheuscscp (Member, Author) commented:

Issue: All HelmReleases upgraded on controller restart

When upgrading the helm-controller to this branch, all existing HelmReleases are upgraded even though nothing changed in their specs. This is problematic for production environments.

Observed Behavior

After updating the helm-controller deployment image, the logs show:

"release not managed by controller: release not observed to be made for object"

This appears for all HelmReleases, followed by an upgrade action for each one.

Root Cause Analysis

1. internal/reconcile/state.go:119-126:

cur := req.Object.Status.History.Latest()
if err := action.VerifyReleaseObject(cur, rls); err != nil {
    if interrors.IsOneOf(err, action.ErrReleaseDigest, action.ErrReleaseNotObserved) {
        return ReleaseState{Status: ReleaseStatusUnmanaged, Reason: err.Error()}, nil
    }
}

2. internal/action/verify.go:130-151 - VerifyReleaseObject:

func VerifyReleaseObject(snapshot *v2.Snapshot, rls *helmrelease.Release) error {
    relDig, err := digest.Parse(snapshot.Digest)
    if err != nil { ... }
    verifier := relDig.Verifier()

    // Re-observe the live Helm release and encode it the same way the stored
    // snapshot digest was originally computed.
    obs := release.ObserveRelease(rls)
    obs.OCIDigest = snapshot.OCIDigest

    if err = obs.Encode(verifier); err != nil { ... }
    if !verifier.Verified() {
        return ErrReleaseNotObserved // <-- This is the error
    }
    return nil
}

3. internal/release/observation.go:63-84 - The Observation struct that gets hashed includes:

  • Name, Version, Namespace
  • Info (timestamps, status)
  • ChartMetadata (name, version, etc.)
  • Config (values)
  • Manifest (rendered templates)
  • Hooks
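
Roughly, the hashed struct has the shape sketched below (field types and JSON tags are paraphrased from the list above rather than copied from observation.go). The point is that it embeds Helm library types directly, so a structural change in Helm v4 changes the encoded bytes:

```go
// Paraphrased sketch of the observation that gets digested; field types and
// tags are assumptions based on the list above, not the actual source.
type Observation struct {
	Name          string         `json:"name"`
	Namespace     string         `json:"namespace"`
	Version       int            `json:"version"`
	Info          release.Info   `json:"info"`          // timestamps, status
	ChartMetadata chart.Metadata `json:"chartMetadata"` // chart name, version, ...
	Config        map[string]any `json:"config"`        // values
	Manifest      string         `json:"manifest"`      // rendered templates
	Hooks         []release.Hook `json:"hooks"`
	OCIDigest     string         `json:"ociDigest"`
}
```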

The Problem

The controller computes a digest of the Helm release (including the manifest and chart metadata) and compares it to the stored snapshot.Digest in Status.History.

When upgrading from Helm v3 to Helm v4, the release data structure likely changed (different manifest format, chart metadata fields, hook structures, etc.), causing the digest to no longer match. The controller interprets this as:

  1. ErrReleaseNotObserved ("release not observed to be made for object")
  2. ReleaseStatusUnmanaged
  3. → Triggers an upgrade to "take ownership"

This is a breaking change when upgrading the helm-controller - all existing releases will be re-upgraded because their digests no longer verify against the stored snapshots.

@matheuscscp (Member, Author) commented on Jan 22, 2026

Proposed Solution for Digest Verification Migration

Problem

When upgrading helm-controller from Helm v3 to v4, the digest computation for release snapshots changes due to structural differences in the Helm release objects. This causes VerifyReleaseObject to fail with ErrReleaseNotObserved, which the controller interprets as an "unmanaged release" and triggers an upgrade for every existing HelmRelease.

Root Cause

The Observation struct is JSON-encoded to compute a digest stored in Snapshot.Digest. When Helm v4 changes the structure of helmrelease.Release, helmrelease.Info, helmrelease.Hook, or chart.Metadata, the JSON encoding produces a different result, causing digest verification to fail even though the release is legitimately managed by the controller.

Solution

Use the existing Snapshot.APIVersion field (which was provisionally added for exactly this purpose) to version the digest format and skip verification for legacy snapshots.

Changes

9b0ff16

How It Works

  1. New snapshots created after this change will have APIVersion: "v2" - full digest verification applies
  2. Legacy snapshots (no APIVersion field) - digest verification is skipped
  3. Ownership still verified for legacy snapshots via release name, namespace, and version matching (done by the caller before VerifyReleaseObject)
  4. Out-of-sync detection still works - VerifyRelease checks chart name/version and config digest, which are stored separately and remain compatible
  5. Snapshot migration happens naturally when an actual change triggers an upgrade - the new snapshot will have APIVersion: "v2" and the new digest format
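
A minimal sketch of what the legacy-snapshot skip described above could look like at the top of VerifyReleaseObject (the helper name is hypothetical; the actual change is in the commit linked under Changes):

```go
func VerifyReleaseObject(snapshot *v2.Snapshot, rls *helmrelease.Release) error {
	// Legacy snapshots have no APIVersion: their digest was computed over a
	// Helm v3 shaped release object, so verifying it against a Helm v4
	// observation would always fail. Skip digest verification; ownership is
	// still checked by the caller via release name, namespace, and version.
	if snapshot.APIVersion == "" {
		return nil
	}
	// Snapshots with APIVersion "v2" get the full digest verification shown
	// in the excerpt above (parse snapshot.Digest, encode the observation,
	// compare).
	return verifyReleaseObjectDigest(snapshot, rls) // hypothetical helper
}
```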

Migration Behavior

| Scenario | Behavior |
| --- | --- |
| Legacy snapshot + release in-sync | No upgrade, digest verification skipped, snapshot stays legacy |
| Legacy snapshot + release out-of-sync | Normal upgrade, snapshot updated with APIVersion: "v2" |
| New snapshot (v2) + release in-sync | Full digest verification, no upgrade |
| New snapshot (v2) + release out-of-sync | Full digest verification, normal upgrade |

Legacy in-sync releases will continue to skip digest verification until they are upgraded for a legitimate reason (chart version change, values change, etc.). This is acceptable because:

  • Release ownership is still verified by name/namespace/version
  • Chart and config verification still works
  • No unnecessary upgrades are triggered

@stefanprodan (Member) commented:
@matheuscscp when drift detection is enabled, will it trip over the APIVersion mismatch?

@matheuscscp (Member, Author) replied:

> @matheuscscp when drift detection is enabled, will it trip over the APIVersion mismatch?

Apparently, no! Same behavior as no drift detection. 🟢

@stefanprodan (Member) left a review comment:

Please add the SSA field to the API docs

Labels: dependencies (Pull requests that update a dependency)

Linked issue: Add support for Helm v4