We deploy our microservices in two distinct GKE clusters: one for testing and one for production.
Our workloads use Workload Identity. In the test environment everything works well: all workloads share the same Kubernetes service account, which has been bound to a GCP service account.
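For reference, the binding is set up roughly like this (project, namespace, and account names below are placeholders, not our real ones):

```
# Allow the Kubernetes service account to impersonate the GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  app-sa@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[my-namespace/app-ksa]"

# Annotate the KSA so pods using it pick up the GCP service account
kubectl annotate serviceaccount app-ksa \
  --namespace my-namespace \
  iam.gke.io/gcp-service-account=app-sa@my-project.iam.gserviceaccount.com
```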
In "production environment" the cluster is backed by three node pools (I include this info for completeness but I'm not sure it is important) and we have problems with workload identity.
In production, in some containers, if we query the metadata server from a shell or run gcloud, the current user is unexpectedly the service account associated with the node, not the one from Workload Identity. In other pods, Workload Identity works as expected.
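Concretely, the check we run from inside a container looks roughly like this:

```
# Ask the metadata server which identity this pod actually gets
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"

# Or, if the image has gcloud installed
gcloud auth list
```

In the healthy pods this prints the bound GCP service account; in the affected pods it prints the node's service account instead.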
Another potentially interesting detail: only the pods added recently through new Deployments seem to be affected by this "misconfiguration".
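For what it's worth, I can confirm which Kubernetes service account a new pod references, and whether that account carries the Workload Identity annotation, with something like this (names are placeholders):

```
# Which KSA does the pod run as?
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.spec.serviceAccountName}'

# Does that KSA carry the Workload Identity annotation?
kubectl get serviceaccount <ksa-name> -n <namespace> \
  -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'
```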
I'm at a loss as to how to investigate this issue. Do you have any ideas?
Thanks in advance.