wal-g when setting WALG_AZ_PREFIX only substitutes $(SCOPE) and $(PGVERSION) #2227

Open
juwalter opened this issue Feb 17, 2023 · 7 comments · May be fixed by #2905

Comments

@juwalter

Please answer some short questions which should help us to understand your problem / question better.

  • Which image of the operator are you using? --> registry.opensource.zalan.do/acid/postgres-operator:v1.8.2
  • Where do you run it - cloud or metal? --> Bare Metal K8s
  • Are you running Postgres Operator in production? --> yes
  • Type of issue? --> question

Hello,

when following the setup according to https://github.com/zalando/postgres-operator/blob/master/docs/administrator.md#azure-setup

apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-env-overrides
  namespace: postgres-operator-system
data:
  # Any env variable used by spilo can be added
  USE_WALG_BACKUP: "true"
  USE_WALG_RESTORE: "true"
  CLONE_USE_WALG_RESTORE: "true"
  WALG_AZ_PREFIX: "azure://container-name/$(SCOPE)/$(PGVERSION)" # Enables Azure Backups (SCOPE = Cluster name) (PGVERSION = Postgres version)

This all works fine so far. However :) when we were using S3 previously, we got a better "path" for the storage location of the WAL segments, like so:

  • template WALE_S3_PREFIX=$WAL_S3_BUCKET/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}
  • example: s3://postgresql/spilo/mycluster-postgres/6e6599d0-81e1-4a63-8b28-f96f59160096/wal/13

IMHO, this is a much better storage location, as it makes it easy to identify which particular cluster the WAL segments came from. I think it would also allow clusters with the same name to exist in different k8s namespaces. Therefore, I set out to replicate this pattern using Azure blob, but no dice.

I found that this WALE_S3_PREFIX actually seems to be created by spilo at https://github.com/zalando/spilo/blob/a86778bd601c4f6de98db9d207a8c1e6af31c984/postgres-appliance/scripts/configure_spilo.py#L892

    prefix_env_name = write_envdir_names[0]
    store_type = prefix_env_name[5:].split('_')[0]
    if not wale.get(prefix_env_name):  # WALE_*_PREFIX is not defined in the environment
        bucket_path = '/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}'.format(**wale)
        prefix_template = '{0}://{{WAL_{1}_BUCKET}}{2}'.format(store_type.lower(), store_type, bucket_path)
        wale[prefix_env_name] = prefix_template.format(**wale)
    # Set WALG_*_PREFIX for future compatibility
    if store_type in ('S3', 'GS') and not wale.get(write_envdir_names[1]):
        wale[write_envdir_names[1]] = wale[prefix_env_name]
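For illustration, here is a self-contained sketch of how that template expands, using the example values from the S3 path above. This mirrors the spilo snippet but is not the actual spilo code; `build_prefix` and the sample `wale` dict are hypothetical.

```python
def build_prefix(wale, prefix_env_name):
    # e.g. 'WALE_S3_PREFIX' -> 'S3'
    store_type = prefix_env_name[5:].split('_')[0]
    # Build the path portion from the scope/suffix variables.
    bucket_path = '/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}'.format(**wale)
    # Prepend the scheme and bucket placeholder, then fill it in.
    prefix_template = '{0}://{{WAL_{1}_BUCKET}}{2}'.format(store_type.lower(), store_type, bucket_path)
    return prefix_template.format(**wale)

# Hypothetical values matching the example path from earlier:
wale = {
    'WAL_S3_BUCKET': 'postgresql',
    'WAL_BUCKET_SCOPE_PREFIX': '',
    'SCOPE': 'mycluster-postgres',
    'WAL_BUCKET_SCOPE_SUFFIX': '/6e6599d0-81e1-4a63-8b28-f96f59160096',
    'PGVERSION': '13',
}
print(build_prefix(wale, 'WALE_S3_PREFIX'))
# s3://postgresql/spilo/mycluster-postgres/6e6599d0-81e1-4a63-8b28-f96f59160096/wal/13
```

Note that the cluster uid lands in the path via WAL_BUCKET_SCOPE_SUFFIX, not SCOPE.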

What I did not find was how / where WALG_AZ_PREFIX: "azure://container-name/$(SCOPE)/$(PGVERSION)" is actually interpolated. It does not appear to happen in spilo, nor in wal-g; that pretty much leaves the operator?

  • q1: can someone point me to where WALG_AZ_PREFIX is interpolated, so I can see if there is a chance to include e.g. $(WAL_BUCKET_SCOPE_SUFFIX)?
  • q2: is it even a good idea to include the cluster uid (WAL_BUCKET_SCOPE_SUFFIX) in this path in the case of Azure? Both s3 and gs talk about buckets, but I think buckets are not a thing in Azure blob.

Many thanks in advance!

@mightymiracleman

Hello, if I understand what you are after, I also had the need to include the cluster id in our backup paths (under Azure). (I wanted to make sure that if I needed to restore, the backups would go to a new location and stay "pristine", so I could attempt multiple restorations if need be.)

I was able to accomplish this by including an initContainers section on the cluster that queries k8s for the postgresql cluster custom resource and stores the uid in a ConfigMap. Then, in the env section of the postgresql cluster, I added a variable referencing the ConfigMap / cluster id. Finally, I overrode WALG_AZ_PREFIX and referenced that newly added env var to get it into the path. I hope all this makes sense. (See below for more detail.)

I was able to get this working by:

  • Adding some additional resources/verbs to the "postgres-pod" cluster role:
        - apiGroups:
          - acid.zalan.do
          resources:
          - postgresqls
          verbs:
          - get
        - apiGroups: [""]
          resources: ["configmaps"]
          verbs:
          - get
          - create
          - update
          - patch
  • Adding an initContainers section to my postgresql cluster (this initContainer queries k8s to get your postgresql CR and retrieves / stores the cluster's uid in a ConfigMap):
    initContainers:
    - args:
      - |
        pgc_id="$(kubectl get postgresql your-cluster-name -o jsonpath='{.metadata.uid}')";
        echo "{\"apiVersion\": \"v1\",\"kind\": \"ConfigMap\",\"metadata\": {\"name\": \"your-config-map-name\"}, \"data\": {\"pgc-id\": \"$pgc_id\"}}" > configmap.json;
        kubectl apply -f configmap.json;
      command:
      - /bin/sh
      - -c
      image: lachlanevenson/k8s-kubectl
      name: set-cluster-id-env

  • Finally, adding an environment variable that is referenced as a dependent env var in WALG_AZ_PREFIX (which I also include in the postgresql cluster manifest to override it from the operator config):

    env:
      - name: PGC_ID
        valueFrom:
          configMapKeyRef:
            name: your-config-map-name
            key: pgc-id
      - name: WALG_AZ_PREFIX
        value: azure://your-blob-name/$(PGC_ID)/wal/$(PGVERSION)

@youvegotmoxie

I was able to overcome this using a solution similar to @mightymiracleman

In the Postgresql manifest

  env:
  - name: PG_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid

In the operator Pod config

WALG_AZ_PREFIX: azure://backups/spilo/$(SCOPE)/$(PG_UID)/wal

@mightymiracleman

> I was able to overcome this using a solution similar to @mightymiracleman
>
> In the Postgresql manifest
>
>       env:
>       - name: PG_UID
>         valueFrom:
>           fieldRef:
>             fieldPath: metadata.uid
>
> In the operator Pod config
>
>     WALG_AZ_PREFIX: azure://backups/spilo/$(SCOPE)/$(PG_UID)/wal

I did give this a try, as it is a lot cleaner than the way I was doing it; but I think this is pulling the uid from the pod and not the postgresql cluster. I added it to one of my clusters and was surprised to see different values for the two env vars I had: the first one was my postgresql cluster uid; the second one I tracked down to the pod that the spilo container was running in. I also bounced the pod and noticed the value changed (and matched up with the pod). Did I miss something on that?

@youvegotmoxie

youvegotmoxie commented May 22, 2024

You are correct. I observed the same after a bit more testing. The Downward API doesn't support getting the UID from the StatefulSet or the postgresql objects, so I think your solution is the only way this will work.

@mightymiracleman

No worries @youvegotmoxie ! I would have loved to do it the way you posted; it was a heck of a lot cleaner.

@siegenthalerroger

The intended capability of the operator (using the k8s dependent env vars) doesn't work the way we want to use it here, because the env list is ordered. The WAL_BUCKET_SCOPE_PREFIX and _SUFFIX envs are added as the last items, and as such, referencing them in WALG_AZ_PREFIX doesn't work.

I'm trying to understand the logic in the code surrounding the ordering of the variables to see if I could create a PR for this.
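To illustrate why the ordering matters: Kubernetes expands $(VAR) references in an env value only against variables defined earlier in the list; unresolved references are left as literal text. Below is a minimal simulation of that behaviour (an illustrative sketch, not the kubelet's actual implementation; it also ignores the $$(VAR) escape).

```python
import re

def expand_env(env_list):
    resolved = {}
    for name, value in env_list:
        # Replace $(VAR) with the value of an earlier variable,
        # leaving unknown references untouched.
        value = re.sub(r'\$\(([A-Za-z_][A-Za-z0-9_]*)\)',
                       lambda m: resolved.get(m.group(1), m.group(0)),
                       value)
        resolved[name] = value
    return resolved

# SCOPE is defined before WALG_AZ_PREFIX, so it expands;
# WAL_BUCKET_SCOPE_SUFFIX comes after, so its reference stays literal.
env = [
    ('SCOPE', 'mycluster'),
    ('WALG_AZ_PREFIX', 'azure://c/$(SCOPE)$(WAL_BUCKET_SCOPE_SUFFIX)/wal'),
    ('WAL_BUCKET_SCOPE_SUFFIX', '/6e6599d0'),
]
print(expand_env(env)['WALG_AZ_PREFIX'])
# azure://c/mycluster$(WAL_BUCKET_SCOPE_SUFFIX)/wal
```

This matches the observation above: $(SCOPE) and $(PGVERSION) resolve because they appear earlier in the generated env list, while the scope prefix/suffix variables do not.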

@mightymiracleman

Not sure what you are asking here (sorry); I added an init container to get the postgresql cluster id so I could set it in a ConfigMap (and reference it using the dependent env var you mentioned).

I had to do this as we were using AKS / Azure Blob Storage; I don't think you have to do this with AWS. Apologies if this isn't what you were after.
