Automate production of CAPI VM images #13
base: main
Conversation
elastisys-staffan
left a comment
Nice work! 🎉 This doesn't actually push or publish the artifact anywhere, right?
The OpenStack part doesn't right now; the Azure one does, since it creates a VM there and all (it's the default behaviour).
HaoruiPeng
left a comment
Good job!
But I have a question concerning OpenStack.
What would the follow-up steps be after the image is created? The workflow stops here, so I assume the image is gone when the runner is destroyed if you don't upload it anywhere.
Does it make sense to upload the image to the dev projects of all our OpenStack infra providers? Then we can share the images to the production projects from there during maintenance.
Another idea is to have a self-hosted runner that does not get destroyed after the actions finish.
We will need credentials to upload and share the OpenStack images; that part is on hold until a decision is made about it.
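For reference, once credentials exist, uploading a built image to an OpenStack project is a single CLI call; a sketch, with the file and image names purely illustrative:

# Upload the Packer output to the current project's image store
# (assumes the usual OS_* credentials are set in the environment).
openstack image create \
  --disk-format qcow2 \
  --container-format bare \
  --file output/ubuntu-2404-kube-v1.33.1.qcow2 \
  ubuntu-2404-kube-v1.33.1

Sharing the image to production projects would be a separate step on top of this.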
Apologies for not being able to look at this sooner. Do you think we could break the jobs up so that building is one job and publishing is another? The logs are quite long, so that would make it easier to follow the process. Also, I think it would add some flexibility.
That can be done for OpenStack, once we decide on how to handle the credentials.
Would it make sense to use qemu or similar to build the Azure VHD locally and then publish it in a separate job? As I understand it, that is how the OpenStack flow works, so if we can make it work for Azure too they would harmonize nicely.
I don't know actually, but it feels like it would add more work for us.
It might be worth the effort actually. I see several benefits:
Imagine we build the images here and push them to object storage or similar. Then pushing and publishing them can be tied to some other process, like creating a capi release. We could even keep the step manual if we want to.
I like this, and we could add the images as artifacts of the action. Then some other job could just download them as the output of this action and push them wherever they need to go.
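A minimal sketch of that artifact hand-off with the stock actions (step names and paths are assumptions, not what this PR does):

# In the build job: persist the image beyond the runner's lifetime.
- name: upload image artifact
  uses: actions/upload-artifact@v4
  with:
    name: openstack-capi-image
    path: output/*.qcow2

# In a later job of the same workflow: fetch it and push wherever needed.
- name: download image artifact
  uses: actions/download-artifact@v4
  with:
    name: openstack-capi-image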
sudo udevadm control --reload-rules
sudo udevadm trigger --name-match=kvm
- name: install qemu-kvn
Nit:
-  - name: install qemu-kvn
+  - name: install qemu-kvm
path: |
  ~/.config/packer/plugins
Question: Any reason this is multiline? Wouldn't this work?
-  path: |
-    ~/.config/packer/plugins
+  path: ~/.config/packer/plugins
I don't remember exactly, should be changed. Will look at it.
run: |
  sudo apt update && \
  sudo apt upgrade -y && \
  sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils qemu-utils
Question: Could we not build an image that has these things already and use that? Would save some time and be less error-prone
Will check that.
(force-pushed from 8c019db to c791d3a)
simonklb
left a comment
Please rebase and request a re-review! 😄
(force-pushed from ce5b032 to caa78fa)
(force-pushed from caa78fa to 2dc7ef9)
simonklb
left a comment
The rebase got some wonky changes. Both a failed conflict resolution and some extra whitespace issues. Please sort them out and re-request a review!
(force-pushed from 2dc7ef9 to ba6d047)
Should be good now.
simonklb
left a comment
Might be worth inviting people to a walkthrough meeting for this. There are a lot of moving parts. 😅
At the very least, update the PR description explaining how all of this should work and what the flow looks like!
Great job doing all of this! 👍
env:
  version: ${{ inputs.version }}
  tag: ${{ inputs.tag }}
  docker_image: "ghcr.io/elastisys/image-builder-amd64:Automate-production-of-CAPI-VM-images-09c9dac9dc61dc069b72ac55e654cbe1a9190911"
Is this image never rebuilt or can we really have it hardcoded like this?
It isn't; the plan is to make it a variable so we can pick up the last one built from main, which is what the workflow will do on push.
I just didn't get to that yet.
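One way to get there, sketched on the assumption that the reusable workflow takes the image as an input and that images built from main are tagged main (a hypothetical convention, not what the repo does today):

on:
  workflow_call:
    inputs:
      docker_image:
        type: string
        # Assumed tagging convention for images built from main.
        default: ghcr.io/elastisys/image-builder-amd64:main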
sed -r \
  -e "s/\\\$KUBERNETES_SERIES/${series}/" \
  -e "s/\\\$KUBERNETES_VERSION/${version}/" \
  -e "s/\\\$KUBERNETES_DEB_VERSION/${package}/" \
  -e "s/\\\$IMAGE_TAG/${tag}/" \
  <"template.json" >"kubernetes.json"
Is envsubst not available?
I just copied what we had done in other places.
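For comparison, a sketch of the envsubst variant, assuming the template uses exactly these four shell-style placeholders:

# Export only what the template needs, then restrict envsubst to those
# variables so any other $... strings in the JSON survive untouched.
export KUBERNETES_SERIES="${series}" KUBERNETES_VERSION="${version}" \
       KUBERNETES_DEB_VERSION="${package}" IMAGE_TAG="${tag}"
envsubst '$KUBERNETES_SERIES $KUBERNETES_VERSION $KUBERNETES_DEB_VERSION $IMAGE_TAG' \
  <template.json >kubernetes.json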
docker run -i --rm \
  -e PACKER_VAR_FILES -e PACKER_GITHUB_API_TOKEN=${{ secrets.GITHUB_TOKEN }} \
  -e SIG_IMAGE_DEFINITION -e SIG_PUBLISHER -e SIG_OFFER -e SIG_SKU \
  -e AZURE_SUBSCRIPTION_ID -e AZURE_CLIENT_ID -e AZURE_CLIENT_SECRET -e AZURE_TENANT_ID -e AZURE_LOCATION \
  -e RESOURCE_GROUP_NAME -e GALLERY_NAME -e BUILD_RESOURCE_GROUP_NAME \
  -v ${{ github.workspace }}/images/capi:/tmp/host \
  ${{ env.docker_image }} build-azure-sig-ubuntu-2404-gen2
This looks odd; why do you have to start the container yourself instead of running it as a workflow job task?
Edit: Found multiple manual docker run executions and I don't understand why they couldn't be normal tasks. Please enlighten me! 😄
There is an option to run the whole job inside a container by default.
I tried it multiple times, but I couldn't figure out why it did not work, so I opted for the surefire method.
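For the record, the built-in option referred to is the job-level container: key; a minimal sketch (the image tag and the final command are placeholders):

jobs:
  build-azure-image:
    runs-on: ubuntu-latest
    # Every run: step executes inside this image, so the manual docker run
    # and its -e/-v plumbing would no longer be needed.
    container:
      image: ghcr.io/elastisys/image-builder-amd64:main
      env:
        PACKER_GITHUB_API_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      # Assumes the image exposes this build target on its PATH.
      - run: build-azure-sig-ubuntu-2404-gen2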
# store-openstack-image-safespring:
#   uses: ./.github/workflows/store-openstack-capi-image-safespring.yml
#   needs: build-openstack-image
#   with:
#     version: ${{ inputs.version || '1.33.1' }}
#     tag: ${{ inputs.tag || '0.8' }}
#   secrets: inherit
Cleanup?
on:
  push:
    branches:
      - main
  # pull_request:
Cleanup?
- name: get tag
  id: get-tag
  run: |
    if [ "${{ github.event_name }}" == "pull_request" ]; then
      PR_TITLE="${{ github.event.pull_request.title }}"
      PR_TAG=$(echo "${PR_TITLE}" | sed -e 's/ /-/g')
      echo "TAG=${PR_TAG}-${{ github.sha }}" >> $GITHUB_OUTPUT
    else
      echo "TAG=${GITHUB_REF##*/}-${{ github.sha }}" >> $GITHUB_OUTPUT
    fi
  shell: bash
This feels brittle. It assumes that everyone will be good and name their PR correctly, and that we never change that policy.
Is it not possible to only trigger this on tags being pushed? Then you can also just rely on ${{ github.ref_name }}.
That part actually should be removed, as the image is not intended to be built from anything but main.
The file will be cleaned up accordingly.
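For reference, the tag-only trigger suggested above would look roughly like this (the tag pattern is an assumption):

on:
  push:
    tags:
      - 'v*'

and the get-tag step then collapses to a single line:

- name: get tag
  id: get-tag
  run: echo "TAG=${{ github.ref_name }}-${{ github.sha }}" >> "$GITHUB_OUTPUT"
  shell: bash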
username: ${{github.actor}}
password: ${{secrets.GITHUB_TOKEN}}
Nit:
-  username: ${{github.actor}}
-  password: ${{secrets.GITHUB_TOKEN}}
+  username: ${{ github.actor }}
+  password: ${{ secrets.GITHUB_TOKEN }}
sed -r \
  -e "s/\\\$KUBERNETES_SERIES/${series}/" \
  -e "s/\\\$KUBERNETES_VERSION/${version}/" \
  -e "s/\\\$KUBERNETES_DEB_VERSION/${package}/" \
  -e "s/\\\$IMAGE_TAG/${tag}/" \
  <"template.json" >"kubernetes.json"
Should use envsubst here as well
Empty file?
Yes, it seems to be needed for SSH capability; copied it from https://github.com/elastisys/ck8s-cluster-api/blob/main/scripts/image/roles/sshca/README.md
Yes, for SSH CA & Certificate login. Not sure if that whole Ansible role step can be disabled when not needed 🤔
Man, it would be cool if we ever started to use that instead of having to push public keys everywhere.
elastisys-staffan
left a comment
I'm a little confused. The store jobs actually push the images straight to the cloud providers, right? When we had the discussion meeting, didn't we agree that the images should be built here and stored as artifacts, and that distribution/consumption would be handled separately for now? Here are the meeting notes for reference: https://docs.google.com/document/d/18inrhGuT2yyaHINowxFbBGj7tYSwafYC2p1vrPy-i3Y/edit?tab=t.736zxsrcd694
Great work on the massive effort you've put into this thing so far, and maybe I'm misunderstanding something, but I think we need to sort this out before moving forward.
That is true, I will update it accordingly; got sidetracked by other things and forgot about it.
elastisys-staffan
left a comment
Even if it is work in progress I'm cool with adding this to main for testing purposes, as long as we keep upstream files unaltered. Regardless, I'm really stoked about this! 😄
elastisys-staffan
left a comment
As I stated before, I think this is ready to be merged for testing purposes, so I will approve. We can always fix any unforeseen problems afterwards. Nice work! 🚀
build-azure-image:
  uses: ./.github/workflows/build-azure-capi-image.yml
  with:
    version: ${{ inputs.version || '1.33.1' }}
Question: Do we need the defaults in two places? Same for the tag etc.
-    version: ${{ inputs.version || '1.33.1' }}
+    version: ${{ inputs.version }}
If I remember correctly, yes, as the default was not always picked up.
Hmm, that feels odd; it seems like the inputs section should be enough. Could it be that it's marked as required and has a default value? Not sure why that should mess with the defaults, but that's the only thing I can see as a bit inconsistent.
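A plausible explanation, offered as an assumption rather than something verified against this repo: when the caller passes version: ${{ inputs.version }} explicitly, an empty value overrides the callee's default, so the default declared in the reusable workflow never applies. Declaring it only in the callee would look like:

on:
  workflow_call:
    inputs:
      version:
        type: string
        required: false
        # Only applies when the caller omits `version` entirely.
        default: '1.33.1'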
viktor-f
left a comment
I also think that this looks good enough to be tested.
Really nice work with this, it will be very nice to have these images built automatically!