ElasticQuota fails to evict pods #2470

Description

I want to test pod eviction on v1.6.0, so I installed Koordinator with the config below

apiVersion: v1
kind: ConfigMap
metadata:
  name: koord-scheduler-config
  namespace: {{ .Values.installation.namespace }}
data:
  koord-scheduler-config: |
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: true
      resourceLock: leases
      resourceName: koord-scheduler
      resourceNamespace: {{ .Values.installation.namespace }}
    profiles:
      - pluginConfig:
        ...
        - name: ElasticQuota
          args:
            apiVersion: kubescheduler.config.k8s.io/v1
            kind: ElasticQuotaArgs
            quotaGroupNamespace: {{ .Values.installation.namespace }}
            enableCheckParentQuota: true # here I set it to true
            enableRuntimeQuota: true
            monitorAllQuotas: true
            revokePodInterval: 5s
            delayEvictTime: 5s
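
For context on why I set the last three fields: as I understand them (paraphrasing, happy to be corrected), monitorAllQuotas turns on the over-used-quota monitor, revokePodInterval is how often it re-checks used vs runtime, and delayEvictTime is how long a quota must stay over-used before its pods are revoked, so with both at 5s I expect evictions within a few seconds. A rough sketch of that timing, with made-up names:

// Rough sketch of the timing I expect; illustrative only, not the real
// monitor loop, and the names below are made up for the example.
const (
    revokePodIntervalSec = 5 // how often the monitor re-checks quota usage
    delayEvictTimeSec    = 5 // how long used > runtime must persist before evicting
)

// shouldRevoke: once a quota has been continuously over-used for at least
// delayEvictTime, its pods become candidates for eviction on the next check.
func shouldRevoke(overUsedForSec int) bool {
    return overUsedForSec >= delayEvictTimeSec
}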

Then I create the 3 quotas below. I set quota-child1's min resources to 0, which means quota-child1 has no guaranteed resources: any pod assigned to quota-child1 can be evicted at any time (see the sketch after the three manifests).

apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: quota-parent
  namespace: default
  labels:
    quota.scheduling.koordinator.sh/is-parent: "true"
spec:
  max:
    cpu: 500m
    memory: 40Mi
  min:
    cpu: 500m
    memory: 40Mi
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: quota-child1
  namespace: default
  labels:
    quota.scheduling.koordinator.sh/is-parent: "false"
    quota.scheduling.koordinator.sh/parent: "quota-parent"
spec:
  max:
    cpu: 500m
    memory: 40Mi
  min:
    cpu: 0
    memory: 0
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: quota-child2
  namespace: default
  annotations:
    quota.scheduling.koordinator.sh/evict-pods-exceed-min: "true" 
  labels:
    quota.scheduling.koordinator.sh/is-parent: "false"
    quota.scheduling.koordinator.sh/parent: "quota-parent"
spec:
  max:
    cpu: 500m
    memory: 40Mi
  min:
    cpu: 500m
    memory: 40Mi
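
To spell out what I expect to happen (only a back-of-the-envelope sketch of the min/max arithmetic, not koordinator's actual runtime-quota calculation; the names are made up):

// Numbers taken from the manifests above; illustrative only.
const (
    parentMaxMemMi = 40 // quota-parent spec.max.memory
    child1MinMemMi = 0  // quota-child1 spec.min.memory: nothing guaranteed
    child2MinMemMi = 40 // quota-child2 spec.min.memory: fully guaranteed
    podMemMi       = 30 // each test pod requests 30Mi of memory
)

// Both pods together need 60Mi, which is more than quota-parent's 40Mi max,
// so the two children have to contend. quota-child1 guarantees nothing
// (min=0), so I expect its pod (pod-example) to be the one that gets evicted,
// while quota-child2 keeps its guaranteed 40Mi for pod2-example.
var expectPodExampleEvicted = 2*podMemMi > parentMaxMemMi && child1MinMemMi == 0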

And I first create the pod-example

apiVersion: v1
kind: Pod
metadata:
  name: pod-example
  namespace: default
  labels:
    quota.scheduling.koordinator.sh/name: "quota-child1"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 5m
    image: busybox
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 40m
        memory: 30Mi
      requests:
        cpu: 40m
        memory: 30Mi
  restartPolicy: Never

After pod-example is Running, I create pod2-example

apiVersion: v1
kind: Pod
metadata:
  name: pod2-example
  namespace: default
  labels:
    quota.scheduling.koordinator.sh/name: "quota-child2"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 5m
    image: busybox
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 40m
        memory: 30Mi
      requests:
        cpu: 40m
        memory: 30Mi
  restartPolicy: Never

Since the two pods' memory requests add up to 60Mi, which is greater than quota-parent's max of 40Mi, pod2-example stays Pending because of the code below.

if g.pluginArgs.EnableCheckParentQuota {
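
My reading of that check (a simplified paraphrase; the type, field, and function names below are made up, not koordinator's actual code) is that with enableCheckParentQuota enabled the quota check also walks up the parent chain, so a pod is rejected if admitting it would push any ancestor over its bound:

// Illustrative paraphrase of the parent-quota check, not the actual
// koordinator code; all names here are made up for the example.
type quotaNode struct {
    name   string
    limit  int64 // stand-in for whatever bound the real check uses (runtime/max)
    used   int64 // requests of pods already admitted in this subtree
    parent *quotaNode
}

// In my setup quota-child2 itself has room (0 + 30Mi <= 40Mi), but
// quota-parent does not (30Mi already used by pod-example + 30Mi > 40Mi),
// so pod2-example stays Pending.
func fitsQuotaTree(q *quotaNode, podRequestMem int64) bool {
    for cur := q; cur != nil; cur = cur.parent {
        if cur.used+podRequestMem > cur.limit {
            return false
        }
    }
    return true
}

If that reading is right, turning enableCheckParentQuota off only hides the problem at admission time: the pod gets scheduled, and the over-use should surface later in the monitor instead.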

So I change the ConfigMap and reinstall Koordinator

apiVersion: v1
kind: ConfigMap
metadata:
  name: koord-scheduler-config
  namespace: {{ .Values.installation.namespace }}
data:
  koord-scheduler-config: |
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: true
      resourceLock: leases
      resourceName: koord-scheduler
      resourceNamespace: {{ .Values.installation.namespace }}
    profiles:
      - pluginConfig:
        ...
        - name: ElasticQuota
          args:
            apiVersion: kubescheduler.config.k8s.io/v1
            kind: ElasticQuotaArgs
            quotaGroupNamespace: {{ .Values.installation.namespace }}
            enableCheckParentQuota: false # here I set it to false
            enableRuntimeQuota: true
            monitorAllQuotas: true
            revokePodInterval: 5s
            delayEvictTime: 5s

After repeating the steps above, both pod-example and pod2-example end up Running, and neither pod is ever evicted.

Looking at the code below and searching the logs, I can see it reporting that quota-parent's used is greater than its runtime.

func (monitor *QuotaOverUsedGroupMonitor) monitor() bool {

However, it never evicts any pods, because the pods are not referenced by the parent quota.

func (gqm *GroupQuotaManager) updatePodCacheNoLock(quotaName string, pod *v1.Pod, isAdd bool) {
    // Only the quota named on the pod is looked up; ancestor quotas such as
    // quota-parent never get the pod added to their pod cache.
    quotaInfo := gqm.getQuotaInfoByNameNoLock(quotaName)
    if quotaInfo == nil {
        return
    }
    if isAdd {
        quotaInfo.addPodIfNotPresent(pod)
    } else {
        quotaInfo.removePodIfPresent(pod)
    }
}

func (gqm *GroupQuotaManager) UpdatePodIsAssigned(quotaName string, pod *v1.Pod, isAssigned bool) error {
    gqm.hierarchyUpdateLock.RLock()
    defer gqm.hierarchyUpdateLock.RUnlock()
    return gqm.updatePodIsAssignedNoLock(quotaName, pod, isAssigned)
}

func (gqm *GroupQuotaManager) updatePodIsAssignedNoLock(quotaName string, pod *v1.Pod, isAssigned bool) error {
    // Again, only the directly named quota is updated.
    quotaInfo := gqm.getQuotaInfoByNameNoLock(quotaName)
    return quotaInfo.UpdatePodIsAssigned(pod, isAssigned)
}
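
Putting it together, my understanding of why nothing gets evicted, as a simplified sketch (the type and helper below are made up, not the real QuotaOverUsedGroupMonitor code): the monitor flags quota-parent because used > runtime, but when it picks victims it only sees quota-parent's own pod cache, which is empty because updatePodCacheNoLock only adds the pod to the leaf quota named on it:

// Illustrative sketch of the gap, not the actual monitor code; the type and
// function below are made up for the example.
type overUsedQuota struct {
    name         string
    usedMemMi    int64    // e.g. 60Mi rolled up to quota-parent
    runtimeMemMi int64    // e.g. 40Mi for quota-parent
    podCache     []string // pods tracked by updatePodCacheNoLock; empty for quota-parent
}

func podsToRevoke(q *overUsedQuota) []string {
    if q.usedMemMi <= q.runtimeMemMi {
        return nil // not over-used, nothing to revoke
    }
    // The monitor flags quota-parent (used 60Mi > runtime 40Mi), but the only
    // candidates it can pick are the pods in this quota's own cache, and
    // updatePodCacheNoLock only ever adds a pod to the quota named on it
    // (quota-child1 / quota-child2), never to the parent. So for quota-parent
    // this list is empty and nothing gets evicted.
    return q.podCache
}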

Is this a bug? Or is there other code that I have overlooked?

Originally posted by @Xdydy in #2342
