afarbos
Contributor

@afarbos afarbos commented Dec 17, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Following #1366, this PR exposes the server used by the identity service in the control plane status. That makes authentication easier, with no need for prior cluster access or secret manipulation. It also adds update support for the identity service config.

Doc: https://cloud.google.com/kubernetes-engine/docs/how-to/oidc

example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: GCPManagedControlPlane
metadata:
  creationTimestamp: "2024-12-17T21:31:23Z"
  finalizers:
  - gcpmanagedcontrolplane.infrastructure.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: foo
  name: foo
  namespace: bar
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: foo
    uid: ce97e489-f1e6-4fbb-bcb9-a34f98d1eab7
  resourceVersion: "16165"
  uid: 7ceb8c04-ab91-4b3c-967a-adf3c9e9be30
spec:
  clusterName: foo
  controlPlaneVersion: 1.30.5
  enableIdentityService: true
  endpoint:
    host: 108.59.84.44
    port: 443
  location: us-central1
  project: "123456"
status:
  conditions:
  - lastTransitionTime: "2024-12-17T22:08:04Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2024-12-17T22:08:04Z"
    reason: GKEControlPlaneCreated
    severity: Info
    status: "False"
    type: GKEControlPlaneCreating
  - lastTransitionTime: "2024-12-17T22:08:04Z"
    status: "True"
    type: GKEControlPlaneReady
  - lastTransitionTime: "2024-12-17T22:27:36Z"
    reason: GKEControlPlaneUpdated
    severity: Info
    status: "False"
    type: GKEControlPlaneUpdating
  currentVersion: 1.30.5
  identityServiceServer: https://34.134.50.254:443 # <- NEW FIELD HERE
  initialized: true
  ready: true

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

GCPManagedControlPlane: Add support for exposing the identity service server in status and for updating the identity service

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Dec 17, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 17, 2024
@k8s-ci-robot
Contributor

Hi @afarbos. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 17, 2024

netlify bot commented Dec 17, 2024

Deploy Preview for kubernetes-sigs-cluster-api-gcp ready!

Name Link
🔨 Latest commit c8c780a
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-cluster-api-gcp/deploys/68a89eea4e1ea20008371556
😎 Deploy Preview https://deploy-preview-1385--kubernetes-sigs-cluster-api-gcp.netlify.app

@afarbos
Contributor Author

afarbos commented Dec 17, 2024

I ran make generate, make verify, and make test locally; all pass.
I also verified that everything works using tilt.

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 97b4416 to 6bda6de Compare December 17, 2024 23:26
@salasberryfin
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 19, 2024
@salasberryfin
Contributor

Thanks @afarbos. Could you please rebase your commits to squash them into logical changes? This helps maintain a clean history and simplifies reverts, if needed.

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 6bda6de to 3cbacd9 Compare December 19, 2024 16:48
@afarbos
Contributor Author

afarbos commented Dec 19, 2024

> Thanks @afarbos. Could you please rebase your commits to squash them into logical changes? This helps maintain a clean history and simplifies reverts, if needed.

sounds good, done!

3 commits:

  • fix local dev
  • fix add update logic for identity server
  • feat expose identity service server

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 3cbacd9 to 60fdfc2 Compare December 19, 2024 16:55
- } else if updateErr := s.updateCAPIKubeconfigSecret(ctx, configSecret); updateErr != nil {
- 	return fmt.Errorf("updating kubeconfig secret: %w", err)
+ } else if kubeConfig, err = s.updateCAPIKubeconfigSecret(ctx, configSecret); err != nil {
+ 	return nil, fmt.Errorf("updating kubeconfig secret: %w", err)
Contributor Author

Note: this also fixes an error. The old code wrapped err where it should have wrapped updateErr; now everything uses err.

@afarbos afarbos requested a review from salasberryfin January 15, 2025 22:12
@afarbos
Contributor Author

afarbos commented Mar 4, 2025

@salasberryfin friendly bump

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2025
@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 60fdfc2 to a3b41e7 Compare March 6, 2025 19:48
@afarbos
Contributor Author

afarbos commented Mar 6, 2025

> PR needs rebase.

rebased

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from a3b41e7 to 3bad5b6 Compare March 6, 2025 19:52
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2025
@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 6dab332 to f702527 Compare April 7, 2025 18:32
@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from f702527 to f1f8dc7 Compare April 7, 2025 18:34
@afarbos
Contributor Author

afarbos commented Apr 7, 2025

rebased

@salasberryfin
Contributor

Hey @afarbos, thanks for the contributions (and the patience), and sorry for the delay in responding. The only issue I see is that we're not actively testing managed cluster functionality, so would it be okay to wait for the kubernetes/k8s.io#7665 request to be resolved so we can actually test that GKE provisioning is functioning properly?

@afarbos
Contributor Author

afarbos commented May 8, 2025

> Hey @afarbos, thanks for the contributions (and the patience), and sorry for the delay in responding. The only issue I see is that we're not actively testing managed cluster functionality, so would it be okay to wait for the kubernetes/k8s.io#7665 request to be resolved so we can actually test that GKE provisioning is functioning properly?

Yes, this was started from my other issue #1371 😭
I have been tracking it hoping to see it done. Sounds good 🤞 for soon*ish.

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from f1f8dc7 to 6cf8e76 Compare June 25, 2025 15:21
@afarbos
Contributor Author

afarbos commented Jun 27, 2025

@salasberryfin I think this is now resolved, thanks to you. Thank you!
Could you take another look and let me know what is missing?

@afarbos
Contributor Author

afarbos commented Jun 30, 2025

/test pull-cluster-api-provider-gcp-test

@salasberryfin
Contributor

You will need to rebase this to fix the gcp-apidiff.

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from 6cf8e76 to e03c99a Compare July 31, 2025 18:20
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: afarbos
Once this PR has been reviewed and has the lgtm label, please assign salasberryfin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@afarbos
Contributor Author

afarbos commented Jul 31, 2025

> You will need to rebase this to fix the gcp-apidiff.

ack, rebased. 👀

@afarbos afarbos force-pushed the af/RetrieveIDServiceServer branch from e03c99a to c8c780a Compare August 22, 2025 16:46
@afarbos
Contributor Author

afarbos commented Aug 22, 2025

Rebased again; make verify && make test still pass.

@@ -168,6 +173,11 @@ func (s *Service) Reconcile(ctx context.Context) (ctrl.Result, error) {
return ctrl.Result{}, err
}

err = s.reconcileIdentityService(ctx, kubeConfig, &log)
Contributor

Nit: I always recommend the one-line form when we don't need the error later:

if err := s.reconcileIdentityService(..); err != nil {

(No need to fix, just my 2c)
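
For comparison, here is a small sketch of the two forms this nit refers to. reconcileStep is a hypothetical stand-in for reconcileIdentityService; nothing here is taken from the PR itself.

```go
package main

import (
	"errors"
	"fmt"
)

// reconcileStep is a hypothetical stand-in for reconcileIdentityService.
func reconcileStep(fail bool) error {
	if fail {
		return errors.New("identity service not ready")
	}
	return nil
}

// twoLine assigns to a function-scoped err, which stays visible afterwards.
func twoLine(fail bool) error {
	var err error
	err = reconcileStep(fail)
	if err != nil {
		return err
	}
	return nil
}

// oneLine scopes err to the if statement, so it cannot leak into later code.
func oneLine(fail bool) error {
	if err := reconcileStep(fail); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(twoLine(true), oneLine(true))
}
```

Both functions behave the same; the one-line form only narrows the scope of err, which reduces the chance of wrapping or returning a stale error later in the function.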

desiredEnableIdentityService := s.scope.GCPManagedControlPlane.Spec.EnableIdentityService
if desiredEnableIdentityService != existingCluster.GetIdentityServiceConfig().GetEnabled() {
	needUpdate = true
	clusterUpdate.DesiredIdentityServiceConfig = &containerpb.IdentityServiceConfig{Enabled: desiredEnableIdentityService}
Contributor

FYI I believe that some (OK, most) fields cannot be updated in "one shot". https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1/ClusterUpdate says "Exactly one update can be applied to a cluster with each request, so at most one field can be provided."

I think the easiest way to handle this is probably to build the UpdateClusterRequest as we are doing here, but then to break it down into one-field-at-a-time requests when we actually go to call UpdateCluster

(I don't know whether we want to handle in this PR - or maybe it is handled somewhere else and I missed it - but it is a classic gotcha that I'm sure we'll hit!)

Contributor

This is an interesting limit I did not know about. We're actually updating the UpdateClusterRequest with all of the changes we detect and then calling UpdateCluster(). Should we be having issues with GCP rejecting multiple changes at the same time? I don't recall seeing this. Does it mean that only one of the updated fields is applied and an update of multiple fields needs as many reconciliations?
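
The "build one combined update, then send one field at a time" idea from this thread could look roughly like the sketch below. It is written under assumptions: clusterUpdate is a simplified, hypothetical struct with three illustrative fields, not the real containerpb.ClusterUpdate, and whether GKE actually rejects multi-field updates is left open here, as it is in the thread.

```go
package main

import "fmt"

// clusterUpdate is a simplified, hypothetical stand-in for GKE's ClusterUpdate
// message. Pointer fields are nil when no change is desired.
type clusterUpdate struct {
	DesiredMasterVersion   *string
	DesiredIdentityService *bool
	DesiredReleaseChannel  *string
}

// splitUpdate returns one clusterUpdate per populated field, in a fixed order,
// so that each request carries exactly one change.
func splitUpdate(u clusterUpdate) []clusterUpdate {
	var out []clusterUpdate
	if u.DesiredMasterVersion != nil {
		out = append(out, clusterUpdate{DesiredMasterVersion: u.DesiredMasterVersion})
	}
	if u.DesiredIdentityService != nil {
		out = append(out, clusterUpdate{DesiredIdentityService: u.DesiredIdentityService})
	}
	if u.DesiredReleaseChannel != nil {
		out = append(out, clusterUpdate{DesiredReleaseChannel: u.DesiredReleaseChannel})
	}
	return out
}

func main() {
	version, enabled := "1.30.5", true
	combined := clusterUpdate{DesiredMasterVersion: &version, DesiredIdentityService: &enabled}
	fmt.Println("requests to send:", len(splitUpdate(combined))) // 2
}
```

A reconciler using this shape could send the first single-field update, requeue, and pick up the remaining differences on the next pass, at the cost of several reconciliations for a multi-field change.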

identityServiceServer, err := s.getIdentityServiceServer(ctx, kubeConfig)
if err != nil {
	err = fmt.Errorf("failed to retrieve identity service: %w", err)
	log.Error(err, "Failed to retrieve identity service server")
Contributor

Nit: I personally think we shouldn't do this; we should rely on the caller to log the error. I'm guessing this pattern shows up more often than we would like, though.

(Another thought, likely not for this PR - we should decide whether we should pass the logr in, vs getting it from the ctx)

Contributor

I second this: it is cleaner to have the caller log the error and the called method only return it.

I also agree on the question of passing the logger as an argument. We're already doing that, and I think it is overly convoluted, but I suggest we discuss this in a separate issue and open a stand-alone PR to tidy it up.
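
A minimal sketch of the "callee returns, caller logs" split discussed in this thread follows. It uses the standard library log package rather than logr to stay self-contained, and fetchServer and reconcile are hypothetical names, not functions from this repository.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
	"strings"
)

// fetchServer is a hypothetical callee: it wraps and returns, and never logs.
func fetchServer() (string, error) {
	return "", fmt.Errorf("failed to retrieve identity service: %w", errors.New("connection refused"))
}

// reconcile is the hypothetical caller; it is the single place that logs.
func reconcile(logger *log.Logger) error {
	if _, err := fetchServer(); err != nil {
		logger.Printf("reconcile failed: %v", err)
		return err
	}
	return nil
}

// loggedLines runs reconcile against an in-memory logger and counts log lines.
func loggedLines() int {
	var buf bytes.Buffer
	_ = reconcile(log.New(&buf, "", 0))
	return strings.Count(buf.String(), "\n")
}

func main() {
	fmt.Println("log lines:", loggedLines()) // 1
}
```

Keeping the log call at one level means a single failure produces a single log entry, with the wrapped message still carrying the context added lower down.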

@justinsb
Contributor

Some comments but nothing blocking IMO

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 23, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
4 participants