Compatibility issue with using a cloud provider and kubelet-csr-approver (any helm app) #12059
Labels: kind/bug, triage/accepted
Hi, (creating this issue on the back of another issue #11842)
Kubespray Version - v2.24.0
Python - 3.9.0
Ansible core - 2.15.13
Key vars:
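For reference, the relevant settings look roughly like this (an illustrative sketch of the group_vars involved; variable names as per Kubespray's docs, exact values from my inventory assumed):

```yaml
# inventory/mycluster/group_vars/all/all.yml (illustrative)
cloud_provider: external            # makes kubelet start with --cloud-provider=external
external_cloud_provider: vsphere    # deploy the vSphere cloud controller manager

# hardening-related settings that pull in kubelet-csr-approver (assumed)
kubelet_rotate_server_certificates: true
kubelet_csr_approver_enabled: true
```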
I've run into a similar issue with Kubespray whilst using vSphere + hardening.
I've done some digging and it seems to be a chicken-and-egg scenario: the NoSchedule taints are there because of kubelet's startup args. When you set cloud_provider: external, you're telling Kubespray (and ultimately Kubernetes) to start each node's kubelet.service with --cloud-provider=external, which registers the node with the node.cloudprovider.kubernetes.io/uninitialized:NoSchedule taint. Whilst it is generally ok to do this to the control planes, doing it on the worker nodes before the cloud controller manager is deployed leaves the nodes in a tainted state. Hence any workloads you deploy (e.g. kubelet-csr-approver in this case) will fail to schedule!
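You can confirm the taint is the culprit with something like this (standard kubectl, illustrative):

```sh
# List each node together with its taint keys; freshly registered nodes will show
# node.cloudprovider.kubernetes.io/uninitialized until the cloud controller
# manager initialises them and removes the taint.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
```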
If you check the events for the pending kubelet-csr-approver pods, you'll most likely see something along the lines of:
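A representative FailedScheduling event (exact wording varies by Kubernetes version):

```
0/3 nodes are available: 3 node(s) had untolerated taint
{node.cloudprovider.kubernetes.io/uninitialized: true}.
```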
Note that the node roles occur way before the kubelet-csr-approver role. You can check the order here.
This has been further discussed here.
You can also see the cloud controller manager docs about this flag here.
As a workaround you can run cluster.yml twice: on the first run, skip the kubelet-csr-approver tag; once the cloud controller manager is up and has untainted the nodes, do a second run limited to only that tag.
Something like this might work:
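(inventory path illustrative; this assumes the role is tagged kubelet-csr-approver)

```sh
# First pass: bring the cluster up, but skip deploying kubelet-csr-approver
ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml \
  --skip-tags kubelet-csr-approver

# Wait for the vSphere cloud controller manager to initialise the nodes and
# remove the node.cloudprovider.kubernetes.io/uninitialized taints, then:

# Second pass: run only the kubelet-csr-approver part
ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml \
  --tags kubelet-csr-approver
```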
Wondering if anyone else has any other workarounds or a permanent solution.