Conversation
Migrate openstack/scs2 and docker/scs2 cluster stacks to CAPI v1beta2:
CAPI v1beta2 Migration:
- ClusterClass, KubeadmControlPlaneTemplate, KubeadmConfigTemplate -> v1beta2
- Infrastructure resources (CAPO/CAPD) remain v1beta1 (providers not yet v1beta2)
- ref -> templateRef, workers template: wrapper removed
- extraArgs/kubeletExtraArgs converted from map to list of {name, value} (see the sketch below)
- oidcConfig patches use extraArgs/- (list-append) pattern
- apiServer: {} removed (fails minProperties:1, created dynamically by patches)
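As a sketch, the map-to-list conversion described above looks like this (surrounding KubeadmControlPlaneTemplate fields abbreviated; the flag shown is just an example):

```yaml
# v1beta1: extraArgs as a map
apiServer:
  extraArgs:
    profiling: "false"

# v1beta2: extraArgs as a list of {name, value}
apiServer:
  extraArgs:
    - name: profiling
      value: "false"
```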
Variable Defaults Consolidation:
- All ClusterClass variable defaults moved to values.yaml
- Templates reference {{ .Values.variables.* }} instead of hardcoded values (example below)
- 20+ variables for openstack/scs2, 4 for docker/scs2
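A rough sketch of the consolidation pattern (the values.yaml layout and the variable shown are illustrative assumptions, not quoted from this PR):

```yaml
# values.yaml (assumed layout)
variables:
  flavor: "SCS-2V-4-20s"

# ClusterClass template (Helm): the default now comes from values.yaml
variables:
  - name: flavor
    required: true
    schema:
      openAPIV3Schema:
        type: string
        default: {{ .Values.variables.flavor | quote }}
```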
New Features:
- Registry mirrors: registryMirrors array variable with containerd hosts.toml patches
- OIDC authentication: Full oidcConfig variable + 6 apiServer extraArgs patches (one sketched below)
- certSANs: Extra Subject Alternative Names for API server cert
- AfterClusterUpgrade hook: Third lifecycle stage in clusteraddon.yaml
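To illustrate how the oidcConfig variable and the extraArgs/- list-append pattern fit together, a hypothetical ClusterClass patch could look roughly like this (patch name, variable fields, and the exact JSON path are assumptions, not the literal contents of this PR):

```yaml
- name: oidcIssuerUrl
  enabledIf: '{{ if .oidcConfig.issuerUrl }}true{{ end }}'
  definitions:
    - selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta2
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
      jsonPatches:
        # Append one flag to the apiServer extraArgs list (the list itself is
        # created dynamically by an earlier patch, see above)
        - op: add
          path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/apiServer/extraArgs/-
          valueFrom:
            template: |
              name: oidc-issuer-url
              value: {{ .oidcConfig.issuerUrl }}
```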
Security Hardening:
- controller-manager: --profiling=false, --terminated-pod-gc-threshold=100 (see excerpt below)
- scheduler: --profiling=false
- etcd: metrics exposed, auto-compaction, tuned election/heartbeat
- kube-proxy: metrics on 0.0.0.0:10249 (docker/scs2)
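In the v1beta2 list form, the controller-manager and scheduler flags above end up roughly as follows (excerpt from the clusterConfiguration; etcd and kube-proxy tuning omitted since their exact values aren't reproduced in this summary):

```yaml
controllerManager:
  extraArgs:
    - name: profiling
      value: "false"
    - name: terminated-pod-gc-threshold
      value: "100"
scheduler:
  extraArgs:
    - name: profiling
      value: "false"
```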
docker/scs2 (new stack):
- Complete new cluster stack for Docker provider with v1beta2
- Cilium 1.19.0 with Gateway API + SCTP support (values sketched below)
- metrics-server 3.13.0
- Multi-version build support (versions.yaml: 1.32-1.35)
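The Cilium features mentioned above are typically switched on via Helm values along these lines (keys taken from the upstream Cilium chart; not quoted from this PR):

```yaml
gatewayAPI:
  enabled: true
sctp:
  enabled: true
```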
Addon Version Bumps (openstack/scs2):
- Cilium 1.18.5 -> 1.19.0
- openstack-cloud-controller-manager 2.34.1 -> 2.34.2
- openstack-cinder-csi 2.34.1 -> 2.34.3
- versions.yaml: key renames (occm/cinder_csi -> full chart names), ubuntu field
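Roughly, the versions.yaml change amounts to the following (old and new key names inferred from the chart names above):

```yaml
# before
occm: 2.34.1
cinder_csi: 2.34.1

# after (keys renamed to the full chart names)
openstack-cloud-controller-manager: 2.34.2
openstack-cinder-csi: 2.34.3
# plus the new "ubuntu" field (node image version; value not shown in this summary)
```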
Documentation:
- Rewritten overview.md, configuration.md
- New quickstart.mdx (Docusaurus Tabs for provider selection)
- New versioning.md, build-system.md
- Removed outdated kamaji.md
Legacy stacks (minimal):
- openstack/scs: ubuntu field added to versions.yaml
- docker/scs: imageRepository variable in values.yaml, versions.yaml created
Assisted-by: Claude Code
Signed-off-by: Jan Schoone <jan.schoone@uhurutec.com>
garloff left a comment:
Well, this is scs3 then, because we change all the variable names again.
"Old scs2" and "new scs2" is just trying to maximize customer confusion.
I honestly would hate this as a customer.
Question:
Can we keep variable names etc. stable when going to v1beta2, or do we need to change them anyway? In the latter case, I'd say let's call it scs3.
Otherwise let's avoid another round of variable renames.
There is really only one variable missing from scs2 in my practical usage of the stack: serviceLoadBalancer. But adding things is not a problem (as long as the defaults are in line with old behavior).
variables:
  - name: flavor
    value: "SCS-2V-4-20s"
  - name: rootDisk
Is this the rootDisk for the workers? That would make sense then: SCS-4V-8 plus a 50GB cinder disk.
If it is for the controlPlane, then the example is a very bad idea: we have the flavor SCS-2V-4-20s, which has a fast local SSD/NVMe that costs a bit more than a 20 GB cinder disk (in most clouds) but has the big advantage of low write latency and thus a stable etcd cluster. If you now attach a 50 GB cinder root disk, you get the worst of both worlds: you pay for a local SSD/NVMe which you can't even see, and your etcd is not stable :-(
We had the variables controlPlaneFlavor, controlPlaneRootDisk, workerFlavor and workerRootDisk in the scs2 ClusterClass. I would expect to see these variables here ...
It's for all machines and can/should be configured for different control planes and machine deployments using variables.override.
This allows us to have the same variables for clusters with a hosted control plane, where controlPlane* variables make no sense.
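For example, a minimal sketch on the Cluster resource (variable names follow this PR; the machine deployment name/class, flavor values, and the rootDisk schema are illustrative assumptions):

```yaml
spec:
  topology:
    variables:                    # cluster-wide defaults, used by the control plane
      - name: flavor
        value: "SCS-2V-4-20s"     # local-disk flavor, low write latency for etcd
    workers:
      machineDeployments:
        - class: default-worker   # hypothetical class name
          name: md-0
          variables:
            overrides:            # per-MachineDeployment overrides
              - name: flavor
                value: "SCS-4V-8"
              - name: rootDisk
                value: 50         # assumed to be a size in GB; the actual schema may differ
```

Control-plane-specific values would go under spec.topology.controlPlane.variables.overrides in the same way.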
| scs (old) | scs2 (new) | Notes |
|-----------|------------|-------|
| `controller_flavor` | `flavor` | Unified; override per CP or worker via topology overrides |
OK, so we need scs3 if we change the variable names again.
Why oh why?
Also, I consider it a regression.
I regularly chose SCS-2V-4-20s flavors for the control plane (no rootDisk) and a flavor without root disk (and potentially more vCPUs/RAM) for workers.
This allows us to have the same variables for clusters with a hosted control plane where controlPlane* variables make no sense.
Variables can be overridden per machineDeployment and control plane.
Sorry for my rant, but changing the variable names again ...
This must be a new Cluster Stack anyway, as we said, because of v1beta2 and the changes we want to implement for the workload clusters' default load balancer configuration.
OK, good, so let's work on a good scs3 clusterStack with v1beta2 then.
There were just many discussions on the configuration at the hackathon which are not yet visible in this PR.
Basically, adding variables breaks the idea of Cluster Stacks just the same, since the description does not match anymore, but it can be mentioned in the docs with "available from ..." or something.
@jschoone @garloff @janiskemper Side note: the cso validating webhook does not allow changing the ClusterStack of an existing cluster (e.g. scs -> scs2) and therefore needs to be disabled for cluster resources during the upgrade.
In general, my very strong opinion is that if we want people to upgrade their clusters from one version of a cluster stack to another, then it must be the same cluster stack. I don't see any reason to create different cluster stacks if we want people to upgrade their clusters from one to another. The only reason for me to create a new cluster stack is if we want to maintain both in parallel because they have valid, but different, configurations. If we introduce breaking changes in the configuration, then we need to think about upgrade paths (if we want people to upgrade) independently of the name of the cluster stack.
@janiskemper Would it be possible in general to use SemVer for ClusterStacks, for example, to distinguish between patch, minor, and major updates with breaking changes?
Not right now. But I don't think it is necessary. How we do it at Syself is that we use the Kubernetes minor versions to introduce new stuff. Within one Kubernetes minor version we stick to small changes and patches, while with new minor versions we introduce more things. We haven't actually had breaking changes so far, but if they are necessary, and people should still be able to upgrade their clusters, I suggest introducing these breaking changes with a new Kubernetes minor version. In this way, there is no need for SemVer versioning. This is in fact the reason why we didn't introduce it: the versioning based on the Kubernetes version already gives you a new "major" version every few months (based on e.g. "openstack-scs-1-34-v1").
Ah, so if we follow that pattern we need one directory per Kubernetes version, or at least one per group of Kubernetes versions with the same features. Could work, probably not more or less messy than now :D
We do in fact have one directory per Kubernetes minor version!
Yes, I know. In this repo we replaced that because of code repetition, and because we always had the idea of dealing with breaking changes in different cluster stacks, not their minor versions, where the subdirectories make much more sense now. I think that can really solve the problems we now have, even if it still looks wrong to have the version directories in a git repo :D