
feat: update v1beta2#255

Draft
jschoone wants to merge 2 commits into chore/cs-cleanup from feat/scs2-v1beta2

Conversation

@jschoone
Contributor

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in `fixes #<issue_number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squash commits
  • include documentation
  • add unit tests

@jschoone jschoone changed the base branch from main to chore/cs-cleanup February 18, 2026 08:44
Migrate openstack/scs2 and docker/scs2 cluster stacks to CAPI v1beta2:

CAPI v1beta2 Migration:
- ClusterClass, KubeadmControlPlaneTemplate, KubeadmConfigTemplate -> v1beta2
- Infrastructure resources (CAPO/CAPD) remain v1beta1 (providers not yet v1beta2)
- ref -> templateRef, workers template: wrapper removed
- extraArgs/kubeletExtraArgs converted from map to list of {name, value}
- oidcConfig patches use extraArgs/- (list-append) pattern
- apiServer: {} removed (fails minProperties:1, created dynamically by patches)
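
The map-to-list conversion and the list-append patch pattern can be sketched roughly as follows (a hedged illustration: the flag, variable name, and patch path are examples and depend on the template being patched):

```yaml
# Before (v1beta1): extraArgs as a map
clusterConfiguration:
  apiServer:
    extraArgs:
      profiling: "false"

# After (v1beta2): extraArgs as a list of {name, value}
clusterConfiguration:
  apiServer:
    extraArgs:
    - name: profiling
      value: "false"

# ClusterClass patch appending an entry via extraArgs/- (illustrative variable path)
jsonPatches:
- op: add
  path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/apiServer/extraArgs/-
  valueFrom:
    template: |
      name: oidc-issuer-url
      value: {{ .oidcConfig.issuerUrl }}
```

The list-append form avoids clobbering entries added by other patches, which is why creating `apiServer` dynamically (rather than pre-declaring `apiServer: {}`) works here.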

Variable Defaults Consolidation:
- All ClusterClass variable defaults moved to values.yaml
- Templates reference {{ .Values.variables.* }} instead of hardcoded values
- 20+ variables for openstack/scs2, 4 for docker/scs2
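
As a sketch of how a consolidated default might flow from values.yaml into a ClusterClass variable (variable name and schema are illustrative, not taken from the PR diff):

```yaml
# values.yaml (illustrative)
variables:
  flavor: "SCS-2V-4-20s"

# ClusterClass template (illustrative)
variables:
- name: flavor
  required: false
  schema:
    openAPIV3Schema:
      type: string
      default: {{ .Values.variables.flavor | quote }}
```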

New Features:
- Registry mirrors: registryMirrors array variable with containerd hosts.toml patches
- OIDC authentication: Full oidcConfig variable + 6 apiServer extraArgs patches
- certSANs: Extra Subject Alternative Names for API server cert
- AfterClusterUpgrade hook: Third lifecycle stage in clusteraddon.yaml
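
For context on the registryMirrors feature above, the containerd hosts.toml that such a patch would typically render looks roughly like this (mirror URL hypothetical):

```toml
# /etc/containerd/certs.d/docker.io/hosts.toml (hypothetical mirror)
server = "https://registry-1.docker.io"

[host."https://mirror.example.com"]
  capabilities = ["pull", "resolve"]
```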

Security Hardening:
- controller-manager: --profiling=false, --terminated-pod-gc-threshold=100
- scheduler: --profiling=false
- etcd: metrics exposed, auto-compaction, tuned election/heartbeat
- kube-proxy: metrics on 0.0.0.0:10249 (docker/scs2)

docker/scs2 (new stack):
- Complete new cluster stack for Docker provider with v1beta2
- Cilium 1.19.0 with Gateway API + SCTP support
- metrics-server 3.13.0
- Multi-version build support (versions.yaml: 1.32-1.35)

Addon Version Bumps (openstack/scs2):
- Cilium 1.18.5 -> 1.19.0
- openstack-cloud-controller-manager 2.34.1 -> 2.34.2
- openstack-cinder-csi 2.34.1 -> 2.34.3
- versions.yaml: key renames (occm/cinder_csi -> full chart names), ubuntu field

Documentation:
- Rewritten overview.md, configuration.md
- New quickstart.mdx (Docusaurus Tabs for provider selection)
- New versioning.md, build-system.md
- Removed outdated kamaji.md

Legacy stacks (minimal):
- openstack/scs: ubuntu field added to versions.yaml
- docker/scs: imageRepository variable in values.yaml, versions.yaml created

Assisted-by: Claude Code
Signed-off-by: Jan Schoone <jan.schoone@uhurutec.com>
@jschoone jschoone changed the title refactor(build): Replace build tooling with simplified script-based a… feat: update v1beta2 Feb 18, 2026
Member

@garloff garloff left a comment


Well, this is scs3 then, because we change all the variable names again.
"Old scs2" and "new scs2" is just trying to maximize customer confusion.
I honestly would hate this as customer.

Question:
Can we keep variable names etc. stable when going to v1beta2, or do we need to change them anyway? In the latter case, I'd say let's call it scs3.
Otherwise let's avoid another round of variable renames.
There is really only one variable missing from scs2 in my practical usage of the stack: serviceLoadBalancer. But adding things is not a problem (as long as the defaults are in line with old behavior).

```yaml
variables:
- name: flavor
  value: "SCS-2V-4-20s"
- name: rootDisk
```
Member


Is this the rootDisk for the workers? That would make sense then: SCS-4V-8 plus a 50GB cinder disk.
If it is for the controlPlane, then the example is a very bad idea: We have the flavor SCS-2V-4-20s, which has a fast local SSD/NVMe that costs a bit more than a 20GB cinder disk (in most clouds) but has the big advantage of low write latency and thus a stable etcd cluster. If you now attach a 50GB cinder root disk, you get the worst of both worlds: you pay for a local SSD/NVMe which you can't even see, and your etcd is not stable :-(
We had the variables controlPlaneFlavor, controlPlaneRootDisk, workerFlavor and workerRootDisk in the scs2 clusterClass. I would expect to see these variables here ...

Contributor Author


It's for all machines and can/should be configured for different control planes and machine deployments using variables.overrides.
This allows us to have the same variables for clusters with a hosted control plane, where controlPlane* variables make no sense.
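
For readers unfamiliar with the mechanism: Cluster API topology variables can be overridden per control plane and per MachineDeployment. A hedged sketch (flavor values and the worker class name are examples, not from this PR):

```yaml
spec:
  topology:
    variables:
    - name: flavor
      value: "SCS-2V-4-20s"        # default for all machines
    controlPlane:
      variables:
        overrides:
        - name: flavor
          value: "SCS-2V-4-20s"    # e.g. keep a local-disk flavor for etcd
    workers:
      machineDeployments:
      - class: default-worker
        name: md-0
        variables:
          overrides:
          - name: flavor
            value: "SCS-4V-8"
          - name: rootDisk
            value: 50
```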


| scs (old) | scs2 (new) | Notes |
|-----------|------------|-------|
| `controller_flavor` | `flavor` | Unified; override per CP or worker via topology overrides |
Member


OK, so we need scs3 if we change the variable names again.
Why oh why?
Also, I consider it a regression.
I regularly chose SCS-2V-4-20s flavors for the control plane (no rootDisk), and a flavor without root disk (and potentially more vCPUs/RAM) for workers.

Contributor Author


This allows us to have the same variables for clusters with a hosted control plane, where controlPlane* variables make no sense.
Variables can be overridden per machineDeployment and control plane.

@garloff
Member

garloff commented Feb 19, 2026

Sorry for my rant, but changing the clusterClass variables all the time looks like a very confused way of doing product management for cluster stacks. Communicating to users that you don't care about their pain for handling spurious changes all the time. Trying to get a Lennart-medal? Hiding it behind "old scs2" and "new scs2" adds insult to injury.
If the conversion to v1beta2 makes this hard to avoid, then let's use the opportunity to design a stable setup for the variables that we hopefully can keep stable for more than half a year and call this scs3. I would also object to merging flavor and rootDisk for workers and controlPlane. They have very different requirements.

@jschoone
Contributor Author

jschoone commented Feb 19, 2026

> Sorry for my rant, but changing the clusterClass variables all the time looks like a very confused way of doing product management for cluster stacks. Communicating to users that you don't care about their pain for handling spurious changes all the time. Trying to get a Lennart-medal? Hiding it behind "old scs2" and "new scs2" adds insult to injury. If the conversion to v1beta2 makes this hard to avoid, then let's use the opportunity to design a stable setup for the variables that we hopefully can keep stable for more than half a year and call this scs3. I would also object to merging flavor and rootDisk for workers and controlPlane. They have very different requirements.

This must be a new Cluster Stack anyway, as we said, because of v1beta2 and the changes we want to implement for the Workload Clusters' default load balancer configuration.
Sorry for not marking it as Draft, I did it now.

@jschoone jschoone marked this pull request as draft February 19, 2026 12:10
@garloff
Member

garloff commented Feb 19, 2026

OK, good, so let's work on a good scs3 clusterStack with v1beta2 then.

@jschoone
Contributor Author

There were just many discussions on the configuration at the hackathon which are not yet visible in this PR.
We already reviewed it together and it will change even more.
To avoid confusion I will find a working title for the new Cluster Stack.

> Well, this is scs3 then, because we change all the variable names again. "Old scs2" and "new scs2" is just trying to maximize customer confusion. I honestly would hate this as customer.
>
> Question: Can we keep variable names etc. stable with going to v1beta2 or do we need to change anyways? In the latter case, I'd say let's call it scs3. Otherwise let's avoid another round of variable renames. There is really only one variable missing from scs2 in my practical usage of the stack: serviceLoadBalancer. But adding things is not a problem (as long as the defaults are in line with old behavior).

Basically, adding variables breaks the idea of Cluster Stacks just the same, since the description no longer matches, but it can be mentioned in the docs with an "available from ..." note or something similar.

@Nils98Ar
Member

@jschoone @garloff @janiskemper

Side note: The cso validating webhook does not allow changing the ClusterStack of an existing cluster (e.g. scs -> scs2) and therefore needs to be disabled for cluster resources during the upgrade.

@janiskemper
Member

In general, my very strong opinion is that if we want people to upgrade their clusters from one version of a cluster stack to another, then it must be the same cluster stack.

I don't see any reason to create different cluster stacks if we want people to upgrade their clusters from one to another.

The only reason for me to create a new cluster stack, is if we want to maintain them in parallel because they both have valid, but different configuration.

If we introduce breaking changes in the configuration, then we need to think about upgrade paths (if we want people to upgrade) independently of the name of the cluster stack.

Signed-off-by: Nils Arnold <arnold@aov.de>
@Nils98Ar
Member

@janiskemper Would it be possible in general to use SemVer for ClusterStacks, for example, to distinguish between patch, minor, and major updates with breaking changes?

@janiskemper
Member

Not right now. But I don't think it is necessary. How we do it at Syself is that we use the Kubernetes minor versions to introduce new things. Within one Kubernetes minor version we stick to small changes and patches, while with new minor versions we introduce more things.

We haven't actually had breaking changes so far, but if they become necessary, and people should still be able to upgrade their clusters, I suggest introducing these breaking changes with a new Kubernetes minor version.

In this way, there is no need for semver versioning. This is in fact the reason why we didn't introduce it: the versioning based on the Kubernetes version already gives you a new "major" version every few months (based on e.g. "openstack-scs-1-34-v1").

@jschoone
Contributor Author

> not right now. But I don't think it is necessary. How we do it at Syself is that we use the Kubernetes minor versions to introduce some new stuff. Within one Kubernetes minor version we stick to small changes and patches, while with the new minor versions we introduce more things.
>
> We haven't actually had breaking changes so far, but if they are necessary, and people should still be able to upgrade their clusters, I suggest to introduce these breaking changes with a new Kubernetes minor version.
>
> In this way, there is no need for semver versioning. This is in fact the reason why we didn't introduce it, because the versioning based on the Kubernetes version already gives you a new "major" version every few months (based on e.g. "openstack-scs-1-34-v1").

Ah, so if we follow that pattern we need one dir per Kubernetes version, or at least one per group of Kubernetes versions with the same features. Could work; probably not more or less messy than now :D
Let's see how this could look.

@janiskemper
Member

we do in fact have one directory per Kubernetes minor version!

@jschoone
Contributor Author

jschoone commented Feb 20, 2026

> we do in fact have one directory per Kubernetes minor version!

Yes, I know. In this repo we replaced that because of code repetition, and because we always had the idea of handling breaking changes in different cluster stacks, not in their minor versions; with that approach the subdirectories make much more sense now. I think that can really solve the problems we now have, even if it still looks wrong to have the version directories in a git repo :D
I'll add a possible structure soon.

This was referenced Feb 22, 2026