derekwaynecarr

Member since 8 years ago

Red Hat, Raleigh, NC

242 followers · 2 following · 11 stars · 115 repos

51 contributions in the last year

Pinned
⚡ tools for managing hugepages in kubernetes
⚡ Autoscaling components for Kubernetes
⚡ Public open source repository for the OpenShift Origin server components
⚡ Public open source repository for the OpenShift client tools and the 'rhc' gem.
⚡ OpenShift Development Tools
⚡ A client and daemon for installing and linking Docker containers into systemd across hosts
Activity
Apr
20
1 month ago
issue

derekwaynecarr issue comment kubernetes/test-infra

derekwaynecarr

sig-node: add endocrimes as test-infra approver

sponsored by: @ehashman

I'm a heavy reviewer of test-infra and e2e_node changes and a member of the CI testing subgroup. Formal reviewer since 18 Nov 2021.

Explicitly Reviewed PRs

/cc @ehashman @derekwaynecarr

derekwaynecarr

Thanks for all the effort in the SIG, @endocrimes!


Apr
12
1 month ago
issue

derekwaynecarr issue comment kubernetes/website

derekwaynecarr

Add info about finding the runtime endpoint

Fixes #30974

Add a section to "Find out what container runtime is used on a node" on finding the runtime endpoint, with a clarification that Docker users who already use cri-dockerd wouldn't be affected by the dockershim removal.

/sig docs /language en /cc @sftim @afbjorklund

derekwaynecarr

This looks fine for sig-node.

/lgtm

Apr
8
1 month ago
issue

derekwaynecarr issue comment kubernetes/website

derekwaynecarr

Visually Document Container Memory Metrics and their Relationships

This is a Feature Request

What would you like to be added: A document that explains what all the different container memory metrics mean and how they are interrelated.

Why is this needed: Today, the following metrics exist for container memory:

  • container_memory_cache
  • container_memory_mapped_file
  • container_memory_max_usage_bytes
  • container_memory_rss
  • container_memory_swap
  • container_memory_usage_bytes
  • container_memory_working_set_bytes

I would like to see a document that explains what they are, how they are different or similar to each other, how they nest, what container="" and container="POD" mean, which metric(s) are used by the kubelet to evict, why usage_bytes and max_usage_bytes might differ, the effects of quantized sampling, etc.

Comments: A visual description would be amazing here, as there are hierarchical relationships that would benefit from such a view.
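As a starting point for such a document, the relationships between the composite metrics and the base counters can be sketched. This is a rough illustration of the commonly documented cgroup-v1-style accounting, with simplified parameter names invented for the example; the exact arithmetic differs between kernel and cgroup versions:

```python
def derive_memory_metrics(rss, cache, swap, inactive_file):
    """Derive the composite container memory metrics from base cgroup
    counters (all values in bytes). Field names are illustrative."""
    # container_memory_usage_bytes covers everything, including page cache.
    usage_bytes = rss + cache + swap
    # container_memory_working_set_bytes excludes inactive file-backed
    # pages, which the kernel can reclaim under pressure; this is the
    # value the kubelet compares against limits when deciding to evict.
    working_set_bytes = usage_bytes - inactive_file
    return {
        "container_memory_usage_bytes": usage_bytes,
        "container_memory_working_set_bytes": working_set_bytes,
    }

# Example: 100 MiB anonymous, 50 MiB cache, of which 30 MiB is inactive.
metrics = derive_memory_metrics(rss=100 << 20, cache=50 << 20,
                                swap=0, inactive_file=30 << 20)
```

This also shows why usage_bytes can exceed working_set_bytes on an otherwise healthy container: reclaimable cache counts toward the former but not the latter.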

Apr
5
1 month ago
issue

derekwaynecarr issue comment kubernetes/kubernetes

derekwaynecarr

e2e flake: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write

Looks like we just got a spike of a new run failure message in master: https://storage.googleapis.com/k8s-triage/index.html?pr=1&text=unable%20to%20apply%20cgroup%20configuration&xjob=1-2

Seen in https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/109178/pull-kubernetes-conformance-kind-ga-only-parallel/1509397620936675328

s: "pod \"oidc-discovery-validator\" failed with status: {Phase:Failed Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:172.18.0.2 HostIPs:[{IP:172.18.0.2}] PodIP:10.244.1.130 PodIPs:[{IP:10.244.1.130}] StartTime:2022-03-31 05:35:20 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:oidc-discovery-validator State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:128,Signal:0,Reason:StartError,Message:failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown,StartedAt:1970-01-01 00:00:00 +0000 UTC,FinishedAt:2022-03-31 05:35:21 +0000 UTC,ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:k8s.gcr.io/e2e-test-images/agnhost:2.36 ImageID:k8s.gcr.io/e2e-test-images/[email protected]:f5241226198f5a54d22540acf2b3933ea0f49458f90c51fc75833d0c428687b8 
ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8 Started:0xc000d223ea}] QOSClass:BestEffort EphemeralContainerStatuses:[]}", } pod "oidc-discovery-validator" failed with status: {Phase:Failed Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:172.18.0.2 HostIPs:[{IP:172.18.0.2}] PodIP:10.244.1.130 PodIPs:[{IP:10.244.1.130}] StartTime:2022-03-31 05:35:20 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:oidc-discovery-validator State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:128,Signal:0,Reason:StartError,Message:failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown,StartedAt:1970-01-01 00:00:00 +0000 UTC,FinishedAt:2022-03-31 05:35:21 +0000 UTC,ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 
Image:k8s.gcr.io/e2e-test-images/agnhost:2.36 ImageID:k8s.gcr.io/e2e-test-images/[email protected]:f5241226198f5a54d22540acf2b3933ea0f49458f90c51fc75833d0c428687b8 ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8 Started:0xc000d223ea}] QOSClass:BestEffort EphemeralContainerStatuses:[]}

/milestone v1.24 /sig node

derekwaynecarr

@kolyshkin understood. rdma as an enabled cgroup controller on a target host for kubelet execution was new to me, so I was wondering whether there was a change to the test operating system configuration beyond runc adding awareness.

issue

derekwaynecarr issue comment kubernetes/kubernetes

derekwaynecarr

e2e flake: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write

derekwaynecarr

@mrunalp let's catch up on why rdma is even an available controller on this host.

rdma isn't in this allowed list:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cgroup_manager_linux.go#L260
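The allowlist referenced above means the kubelet only manages controllers it knows about. A minimal sketch of that filtering idea, assuming a hypothetical controller set for illustration (the authoritative list lives in the linked cgroup_manager_linux.go):

```python
# Illustrative allowlist; the real set is defined in Kubernetes source
# and may differ. The point: a controller like "rdma" that is mounted on
# the host but absent from the allowlist is simply skipped.
SUPPORTED_CONTROLLERS = {"cpu", "cpuacct", "cpuset", "memory", "pids", "hugetlb"}

def filter_supported(mounted_controllers):
    """Keep only the cgroup controllers the kubelet manages."""
    return sorted(c for c in mounted_controllers if c in SUPPORTED_CONTROLLERS)

kept = filter_supported(["cpu", "memory", "rdma", "pids"])
# "rdma" is dropped even though the host mounted it
```

This is why a host newly exposing an rdma hierarchy is surprising: the kubelet never asked for it, yet runc still has to write to it once it is aware of the controller.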

Apr
4
1 month ago
issue

derekwaynecarr issue comment kubernetes/community

derekwaynecarr

2021 Annual Report: SIG Architecture

/sig architecture

Signed-off-by: Davanum Srinivas [email protected]

Which issue(s) this PR fixes:

Fixes #

derekwaynecarr

@dims this looks great.

Will let @johnbelamaric put a final tag.

Mar
29
1 month ago
issue

derekwaynecarr issue comment kubernetes/kubernetes

derekwaynecarr

kubelet: check taint/toleration before accepting pods

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR adds a taint/toleration check to the kubelet (except for static pods).

Related PR: https://github.com/kubernetes/kubernetes/pull/100049

Which issue(s) this PR fixes:

Fixes https://github.com/kubernetes/kubernetes/issues/100408

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Kubelet now checks "NoExecute" taint/toleration before accepting pods, except for static pods.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


derekwaynecarr

Caught up with the author on Slack.

We found that the kubelet should only care about the NoExecute taint, and the scheduler should care about the NoSchedule and NoExecute taints. So we decided to split the taint validation logic between the components.

The above is reasonable to me.

/approve /lgtm
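The split described above means the kubelet's admission check only looks at NoExecute taints. A hedged sketch of that rule, with the taint/toleration structures simplified to dicts (the real Kubernetes API also has operators and values that this example omits):

```python
def tolerates(toleration, taint):
    """Simplified match: an absent key or effect tolerates any value."""
    key_match = toleration.get("key") in (None, taint["key"])
    effect_match = toleration.get("effect") in (None, taint["effect"])
    return key_match and effect_match

def kubelet_admits(pod_tolerations, node_taints):
    """Reject only on untolerated NoExecute taints; static pods would
    bypass this check entirely."""
    for taint in node_taints:
        if taint["effect"] != "NoExecute":
            continue  # NoSchedule is the scheduler's concern, not the kubelet's
        if not any(tolerates(t, taint) for t in pod_tolerations):
            return False
    return True
```

For example, a node tainted only with NoSchedule still admits any pod at the kubelet level, while an untolerated NoExecute taint causes rejection.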

open pull request

derekwaynecarr wants to merge kubernetes/kubernetes

kubelet: check taint/toleration before accepting pods

derekwaynecarr

Is there a reason we can't move this check inside scheduler.AdmissionCheck, after it validates node ports for example?

pull request

derekwaynecarr merge to kubernetes/kubernetes

kubelet: check taint/toleration before accepting pods

derekwaynecarr

The logic and tests look good, but it seems like it would be better to move the check inside scheduler.AdmissionFramework itself. Can you explain why that is not preferred, in case I am missing something?


issue

derekwaynecarr issue comment openshift/origin

derekwaynecarr

Clarify that the image mirroring requirement applies to conformance tests only

This is my understanding after discussion with @derekwaynecarr

issue

derekwaynecarr issue comment kubernetes/kubernetes

derekwaynecarr

cpu/memory manager containerMap memory leak

What type of PR is this?

/kind bug

What this PR does / why we need it:

If the CPU manager / memory manager policy is set to none, nothing removes container info from containerMap, which eventually leads to a memory leak.

This PR calls the CPU manager / memory manager RemoveContainer in PostStopContainer to remove the container info from containerMap.
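The leak pattern being fixed can be sketched in a few lines. This is a simplified model, not the kubelet's actual containerMap type; the names are assumptions for illustration:

```python
class ContainerMap:
    """Toy model of the kubelet's containerID -> (podUID, name) map."""

    def __init__(self):
        self._entries = {}

    def add(self, cid, pod_uid, name):
        # Entries are added whenever a container starts.
        self._entries[cid] = (pod_uid, name)

    def remove(self, cid):
        # The fix: invoke this from PostStopContainer. With the 'none'
        # policy nothing called it, so entries accumulated forever.
        self._entries.pop(cid, None)

    def __len__(self):
        return len(self._entries)

cm = ContainerMap()
cm.add("c1", "pod-a", "app")
cm.add("c2", "pod-b", "app")
cm.remove("c1")  # without this call on container stop, the map only grows
```

On a long-lived node churning through many short-lived containers, the unbounded map is the memory leak described above.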

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Mar
28
1 month ago
issue

derekwaynecarr issue comment kubernetes/kubernetes

derekwaynecarr

In-place Pod Vertical Scaling feature

What type of PR is this?

/kind feature /kind api-change

What this PR does / why we need it:

This PR brings the following changes that mostly implement the In-place Pod Vertical Scaling feature:

  1. API change for In-place Pod Vertical Scaling feature
  2. Implementation of CRI API changes to support In-Place Pod Vertical Scaling. (Kubelet CRI changes KEP)
  3. Core implementation that enables In-place vertical scaling for pods, comprehensively tested with docker runtime.
  4. Comprehensive E2E tests to validate In-place pod vertical scaling feature.

Which issue(s) this PR fixes:

xref https://github.com/kubernetes/enhancements/issues/1287 xref https://github.com/kubernetes/enhancements/issues/2273

Special notes for your reviewer:

API changes: See: https://github.com/kubernetes/kubernetes/pull/102884/commits/f60522812bac3c73829fa143fb9f1864de7f15a7 https://github.com/kubernetes/kubernetes/pull/102884/commits/e0216f3d92119a60c125d12a621c862405fa8754

Scheduler changes: See https://github.com/kubernetes/kubernetes/pull/102884/commits/4eb25e03e0bb350f625cc7750cf2477b4cc0d6c0

Kubelet CRI changes and implementation: See: https://github.com/kubernetes/kubernetes/pull/102884/commits/d9862d4107ef0f16faf1770ca33a36130d0f204c https://github.com/kubernetes/kubernetes/pull/102884/commits/fc4e0b1077a1da19bd85b4baff04551eb7fa3fd7 https://github.com/kubernetes/kubernetes/pull/102884/commits/a9e69d5a54c6d1e556fa41514b9a83b86477e3e8 https://github.com/kubernetes/kubernetes/pull/102884/commits/423650a6157479c2eae4383502706671647b5e67

Does this PR introduce a user-facing change? Yes

- PodSpec.Container.Resources becomes mutable for CPU and memory resource types.
- PodSpec.Container.ResizePolicy (new object) gives users control over how their containers are resized.
- PodStatus.Resize status describes the state of a requested Pod resize.
- PodStatus.ResourcesAllocated describes node resources allocated to Pod.
- PodStatus.Resources describes node resources applied to running containers by CRI.
- UpdateContainerResources CRI API now supports both Linux and Windows.

For details, see KEPs below.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
- [KEP]: https://github.com/vinaykul/enhancements/tree/master/keps/sig-node/2273-kubelet-container-resources-cri-api-changes
- [Usage]: via kubectl or the API,
e.g. kubectl patch pod bar --patch '{"spec":{"containers":[{"name":"ale", "resources":{"requests":{"memory":"500Mi"}, "limits":{"memory":"500Mi"}}}]}}'

Jun 26th: PodStatus.Resize has now been fully implemented. @thockin Please see below. I hope this acts as a simple signal to the API user (VPA) as to what's going on with a resize, so they may choose to take alternative action in the Deferred / Infeasible cases as allowed by their policy.

[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh describe no 127.0.0.1
Name:               127.0.0.1
Roles:              <none>
...
...
Addresses:
  InternalIP:  127.0.0.1
  Hostname:    127.0.0.1
Capacity:
  cpu:                16
  ephemeral-storage:  927125032Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32928300Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  854438428077
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             3465772Ki
  pods:               110
System Info:
...
Non-terminated Pods:          (1 in total)
  Namespace                   Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                        ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-66cf7947cf-zvlxf    100m (2%)     0 (0%)      70Mi (2%)        170Mi (5%)     11m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (2%)  0 (0%)
  memory             70Mi (2%)  170Mi (5%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:
...
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# cat ~/YML/2pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
spec:
  containers:
  - name: stress
    image: skiibum/ubuntu-stress:18.10
    resources:
      limits:
        cpu: "500m"
        memory: "500Mi"
      requests:
        cpu: "500m"
        memory: "500Mi"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh create -f ~/YML/2pod.yaml 
pod/2pod created
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 500m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"650m"}, "limits":{"cpu":"650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 650m
        memory: 500Mi
      requests:
        cpu: 650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: InProgress
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"3950m"}, "limits":{"cpu":"3950m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 3950m
        memory: 500Mi
      requests:
        cpu: 3950m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: Deferred
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
(failed reverse-i-search)`': cat /sys/fs/cgroup/cpu/kubepods/podd0dd7678-^Cf5-4b55-ad5d-08a384113ed4/cpu.cfs_quota_us 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"4650m"}, "limits":{"cpu":"4650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 4650m
        memory: 500Mi
      requests:
        cpu: 4650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
...
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
...
  qosClass: Guaranteed
  resize: Infeasible
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core#
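The three resize outcomes in the transcript above (650m resized, 3950m Deferred, 4650m Infeasible against 4 allocatable CPUs) can be summarized with a rough decision sketch. This is an assumption-laden simplification of the kubelet's logic, not the real implementation; the free-capacity figure is invented for the example:

```python
def resize_status(requested_cpu_m, node_allocatable_cpu_m, free_cpu_m):
    """Classify a requested CPU resize (all values in millicores)."""
    if requested_cpu_m > node_allocatable_cpu_m:
        return "Infeasible"  # can never fit on this node, even empty
    if requested_cpu_m > free_cpu_m:
        return "Deferred"    # might fit later as other pods depart
    return "InProgress"      # actuate the resize now

# Mirroring the transcript: node allocatable is 4000m; assume ~3250m is
# currently free after accounting for other pods on the node.
```

The useful property for the API user (e.g. VPA) is that Deferred and Infeasible are distinguishable: a Deferred resize may be worth waiting on, an Infeasible one requires rescheduling.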
issue

derekwaynecarr issue comment kubernetes/kubernetes

In-place Pod Vertical Scaling feature

What type of PR is this?

/kind feature /kind api-change

What this PR does / why we need it:

This PR brings the following changes that mostly implement In-place Pod Vertical Scaling feature:

  1. API change for In-place Pod Vertical Scaling feature
  2. Implementation of CRI API changes to support In-Place Pod Vertical Scaling. (Kubelet CRI changes KEP)
  3. Core implementation that enables In-place vertical scaling for pods, comprehensively tested with docker runtime.
  4. Comprehensive E2E tests to validate In-place pod vertical scaling feature.

Which issue(s) this PR fixes:

xref https://github.com/kubernetes/enhancements/issues/1287 xref https://github.com/kubernetes/enhancements/issues/2273

Special notes for your reviewer:

API changes: See: https://github.com/kubernetes/kubernetes/pull/102884/commits/f60522812bac3c73829fa143fb9f1864de7f15a7 https://github.com/kubernetes/kubernetes/pull/102884/commits/e0216f3d92119a60c125d12a621c862405fa8754

Scheduler changes: See https://github.com/kubernetes/kubernetes/pull/102884/commits/4eb25e03e0bb350f625cc7750cf2477b4cc0d6c0

Kubelet CRI changes and implementation: See: https://github.com/kubernetes/kubernetes/pull/102884/commits/d9862d4107ef0f16faf1770ca33a36130d0f204c https://github.com/kubernetes/kubernetes/pull/102884/commits/fc4e0b1077a1da19bd85b4baff04551eb7fa3fd7 https://github.com/kubernetes/kubernetes/pull/102884/commits/a9e69d5a54c6d1e556fa41514b9a83b86477e3e8 https://github.com/kubernetes/kubernetes/pull/102884/commits/423650a6157479c2eae4383502706671647b5e67

Does this PR introduce a user-facing change? Yes

- PodSpec.Container.Resources becomes mutable for CPU and memory resource types.
- PodSpec.Container.ResizePolicy (new object) gives users control over how their containers are resized.
- PodStatus.Resize status describes the state of a requested Pod resize.
- PodStatus.ResourcesAllocated describes node resources allocated to Pod.
- PodStatus.Resources describes node resources applied to running containers by CRI.
- UpdateContainerResources CRI API now supports both Linux and Windows.

For details, see KEPs below.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
- [KEP]: https://github.com/vinaykul/enhancements/tree/master/keps/sig-node/2273-kubelet-container-resources-cri-api-changes
- [Usage]: via kubectl or API 
e.g kubectl patch pod bar --patch '{"spec":{"containers":[{"name":"ale", "resources":{"requests":{"memory":"500Mi"}, "limits":{"memory":"500Mi"}}}]}}'

Jun 26th: PodStatus.Resize has now been fully implemented. @thockin Please see below. I hope this acts as a simple signal to the API user (VPA) about what is going on with a resize, so it can choose to take alternative action in the Deferred / Infeasible cases as allowed by its policy.
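The transcript below walks through the three observable states. A hypothetical consumer-side outline (Python sketch; the state names come from PodStatus.Resize as demonstrated below, while the action strings are placeholders, not anything defined by the PR):

```python
def next_action(resize_status):
    """Sketch of how a controller such as the VPA might branch on
    PodStatus.Resize. The action names are illustrative only."""
    if resize_status in (None, ""):
        return "done"                    # no resize pending, or it completed
    if resize_status == "InProgress":
        return "wait"                    # kubelet is applying the change
    if resize_status == "Deferred":
        return "requeue"                 # doesn't fit right now; may fit later
    if resize_status == "Infeasible":
        return "evict-and-reschedule"    # exceeds node allocatable; never fits here
    raise ValueError(f"unknown resize status: {resize_status!r}")
```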

[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh describe no 127.0.0.1
Name:               127.0.0.1
Roles:              <none>
...
...
Addresses:
  InternalIP:  127.0.0.1
  Hostname:    127.0.0.1
Capacity:
  cpu:                16
  ephemeral-storage:  927125032Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32928300Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  854438428077
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             3465772Ki
  pods:               110
System Info:
...
Non-terminated Pods:          (1 in total)
  Namespace                   Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                        ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-66cf7947cf-zvlxf    100m (2%)     0 (0%)      70Mi (2%)        170Mi (5%)     11m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (2%)  0 (0%)
  memory             70Mi (2%)  170Mi (5%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:
...
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# cat ~/YML/2pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
spec:
  containers:
  - name: stress
    image: skiibum/ubuntu-stress:18.10
    resources:
      limits:
        cpu: "500m"
        memory: "500Mi"
      requests:
        cpu: "500m"
        memory: "500Mi"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh create -f ~/YML/2pod.yaml 
pod/2pod created
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 500m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"650m"}, "limits":{"cpu":"650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 650m
        memory: 500Mi
      requests:
        cpu: 650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: InProgress
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"3950m"}, "limits":{"cpu":"3950m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 3950m
        memory: 500Mi
      requests:
        cpu: 3950m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: Deferred
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"4650m"}, "limits":{"cpu":"4650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 4650m
        memory: 500Mi
      requests:
        cpu: 4650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
...
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
...
  qosClass: Guaranteed
  resize: Infeasible
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core#
derekwaynecarr
derekwaynecarr

@vinaykul thanks for updates, will review.

/milestone clear

Mar
8
2 months ago
Activity icon
issue

derekwaynecarr issue comment kubernetes/enhancements

derekwaynecarr
derekwaynecarr

127: Add KEP for user namespaces support

Here is the high-level plan we discussed during the Nov 2 SIG Node meeting.

Let me know if this KEP effectively reflects what we discussed and, IIUC, agreed on. I'd like to hear about any concerns or show-stoppers ASAP rather than at the last minute, after we have already put a lot of time into this.

If there are no concerns, this plan is agreed on by all parties, and this is merged, @giuseppe and I will start a PoC implementation to fill in the rest of the details this KEP needs (CRI changes, etc.).

To the best of my knowledge, this incorporates all the feedback that @thockin, @mtaufen, and others gave in PR #2101 (in short: almost everything is automatic, and the room for improvement lands in phase 3; in particular, the per-SA/per-namespace strategies are in scope, along with some others).

Please let us know what you think and thanks again for your time! :)

Tagging people that might want to get their eyes on this: @giuseppe @mrunalp @derekwaynecarr @endocrimes @rhatdan @alban @mauriciovasquezbernal

I created this with @giuseppe, who has already reviewed this.

Below is a copy-paste from the commit message (see the slides linked below for a quick overview, if you prefer that!).


This commit adds the high-level overview I proposed in the SIG Node meeting on Nov 2. The work is divided into phases, and initial support (phases 1 and 2) is disentangled from further improvements that community members wanted to see (phase 3).

This incorporates the valuable feedback from the discussion in PR #2101, making things as automatic as possible and adding a phase 3 for those improvements, while also leaving room for future ones.

Slides used in the Nov 2 SIG-node meeting are here: https://docs.google.com/presentation/d/1z4oiZ7v4DjWpZQI2kbFbI8Q6botFaA07KJYaKA-vZpg/edit#slide=id.gc6f73a04f_0_0

Closes: #2101

Signed-off-by: Rodrigo Campos [email protected]


  • One-line PR description: Add user namespaces KEP with its skeleton and phases
  • Issue link:
  • Other comments:
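For readers new to user namespaces: the kernel expresses a mapping through `/proc/<pid>/uid_map`, where each line maps a range of in-namespace IDs to host IDs. An illustrative Python sketch of that format (not code from the KEP):

```python
def parse_uid_map(text):
    """Parse /proc/<pid>/uid_map: each line is
    '<inside-start> <outside-start> <length>'."""
    mappings = []
    for line in text.strip().splitlines():
        inside, outside, length = (int(f) for f in line.split())
        mappings.append((inside, outside, length))
    return mappings

def to_host_uid(mappings, uid):
    """Translate an in-namespace uid to the host uid backing it."""
    for inside, outside, length in mappings:
        if inside <= uid < inside + length:
            return outside + (uid - inside)
    raise LookupError(f"uid {uid} is unmapped")

# e.g. a pod whose namespace uids 0..65535 are backed by host uids 100000..165535
m = parse_uid_map("0 100000 65536")
print(to_host_uid(m, 0))  # → 100000
```

Root inside the container (uid 0) is then an unprivileged uid on the host, which is the core isolation win this KEP is after.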
derekwaynecarr
derekwaynecarr

Phase 1 looks great.

/approve /lgtm

Mar
7
2 months ago
pull request

derekwaynecarr pull request redhat-et/microshift

derekwaynecarr
derekwaynecarr

Use default GOMAXPROCS behavior

Stop setting GOMAXPROCS, as more recent versions of Go already default it to NumCPU. This aligns with standalone OpenShift.

see: https://pkg.go.dev/runtime#GOMAXPROCS

Activity icon
created branch

derekwaynecarr in derekwaynecarr/microshift create branch use-default-gomaxprocs

createdAt 2 months ago
pull request

derekwaynecarr pull request redhat-et/microshift

derekwaynecarr
derekwaynecarr

Enable kubelet cgroup management

Which issue(s) this PR addresses: Closes #617

Properly configure kubelet cgroup management.

All end-user workloads are partitioned under /kubepods.slice.

Global cpu/memory accounting should be enabled via systemd host configuration as part of host setup. Maybe we can ensure this is done in the RPM as a follow-up?

Activity icon
created branch

derekwaynecarr in derekwaynecarr/microshift create branch enable-kubepods-cgroups

createdAt 2 months ago
Mar
4
2 months ago
Activity icon
fork

derekwaynecarr forked redhat-et/microshift

⚡ A small form factor OpenShift/Kubernetes optimized for edge computing
derekwaynecarr Apache License 2.0 Updated
fork time in 2 months ago
Activity icon
issue

derekwaynecarr issue redhat-et/microshift

derekwaynecarr
derekwaynecarr

[BUG] MicroShift is not parenting user workload in expected part of cgroup tree

What happened:

MicroShift has configured kubelet to launch containers under system.slice rather than kubepods.slice.

What you expected to happen:

I expected kubelet to launch with the following parameters:

--cgroups-per-qos=true
--cgroup-driver=systemd
--cgroup-root=kubepods

This ensures that all cluster workload is isolated from system.slice, that CFS shares are evaluated properly relative to other pods, and that the memory usage of all pods running on the platform is easier to understand.
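With `--cgroup-driver=systemd` and `--cgroup-root=kubepods`, the kubelet nests pods under `kubepods.slice` using systemd's slice-naming convention. A simplified sketch of that path construction (Python for illustration; the real logic lives in the kubelet's Go cgroup manager, and this elides details like the besteffort class handling beyond the pattern shown):

```python
def pod_cgroup_slice(pod_uid, qos="burstable", root="kubepods"):
    """Sketch of the systemd slice path for a pod's cgroup.
    systemd slice names encode the hierarchy with dashes, and the
    pod UID's dashes are replaced with underscores. Guaranteed pods
    sit directly under the root slice; other QoS classes get an
    intermediate per-class slice."""
    uid = pod_uid.replace("-", "_")
    if qos == "guaranteed":
        return f"{root}.slice/{root}-pod{uid}.slice"
    return f"{root}.slice/{root}-{qos}.slice/{root}-{qos}-pod{uid}.slice"

print(pod_cgroup_slice("d0dd7678-8af5-4b55-ad5d-08a384113ed4"))
```

Under the misconfiguration reported here, these pod cgroups end up under system.slice instead, alongside host services.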

How to reproduce it (as minimally and precisely as possible):

Launch the system in a Vagrant VM (release 34) using the RPM-based install method for MicroShift.

Anything else we need to know?:

Mar
2
2 months ago
open pull request

derekwaynecarr wants to merge openshift/hypershift

derekwaynecarr
derekwaynecarr

add pause annotation support for hostedcluster controller, hostedclusterconfig controller, and hostedcontrolplane controller

This PR adds pause-annotation support to the hostedcluster controller, hostedclusterconfig controller, and hostedcontrolplane controller. Equivalent support for nodePools will be added in an independent PR. In this architecture, a user can set the pauseReconciliation annotation to either a boolean value of true or a date. If it is set to true, reconciliation is paused until the annotation is removed. If it is set to an RFC 3339-formatted date, reconciliation is paused until that date has passed, at which point reconciliation begins again.
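The semantics above boil down to one predicate per reconcile loop. An illustrative sketch (Python; the actual implementation is Go in controller-runtime, and `is_paused` is a hypothetical helper name):

```python
from datetime import datetime, timezone

def is_paused(annotation, now=None):
    """Interpret a pause-reconciliation annotation value:
    - "true"              -> paused until the annotation is removed
    - RFC 3339 timestamp  -> paused until that instant has passed
    - absent / malformed  -> not paused
    """
    if annotation is None:
        return False
    if annotation == "true":
        return True
    try:
        # datetime.fromisoformat doesn't accept a trailing 'Z' on
        # older Pythons, so normalize it to an explicit UTC offset.
        until = datetime.fromisoformat(annotation.replace("Z", "+00:00"))
    except ValueError:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    return now < until
```

Each controller would check this at the top of its reconcile function and return early while paused.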

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, use fixes #<issue_number>(, fixes #<issue_number>, ...) format, where issue_number might be a GitHub issue, or a Jira story: Fixes #1046

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.
derekwaynecarr
derekwaynecarr

An argument for spec/status is that we can report, via a condition, that the associated controllers are actually paused. With an annotation alone, don't we have to get under the hood and actually verify the controllers were stopped?

pull request

derekwaynecarr merge to openshift/hypershift

derekwaynecarr
derekwaynecarr

add pause annotation support for hostedcluster controller, hostedclusterconfig controller, and hostedcontrolplane controller

Mar
1
2 months ago
open pull request

derekwaynecarr wants to merge kubernetes/kubernetes

derekwaynecarr
derekwaynecarr

In-place Pod Vertical Scaling feature

What type of PR is this?

/kind feature /kind api-change

What this PR does / why we need it:

This PR brings the following changes, which implement most of the In-place Pod Vertical Scaling feature:

  1. API change for the In-place Pod Vertical Scaling feature
  2. Implementation of CRI API changes to support In-place Pod Vertical Scaling (Kubelet CRI changes KEP)
  3. Core implementation that enables in-place vertical scaling for pods, comprehensively tested with the Docker runtime
  4. Comprehensive E2E tests to validate the in-place pod vertical scaling feature

Which issue(s) this PR fixes:

xref https://github.com/kubernetes/enhancements/issues/1287 xref https://github.com/kubernetes/enhancements/issues/2273

Special notes for your reviewer:

API changes: See https://github.com/kubernetes/kubernetes/pull/102884/commits/993e73c1e0aa632c3fbadac48287bb6b687f11ed

Scheduler changes: See https://github.com/kubernetes/kubernetes/pull/102884/commits/d1f43022ba3219ff18114d5c4770b1ff6370daaa

Kubelet CRI changes and implementation: See https://github.com/kubernetes/kubernetes/pull/102884/commits/f191ef26245cea36e4ef61ad7867e4da83998475 and https://github.com/kubernetes/kubernetes/pull/102884/commits/4bc0765f0857b77af98fdabca4d143f4d49ca1e9

Does this PR introduce a user-facing change? Yes

- PodSpec.Container.Resources becomes mutable for CPU and memory resource types.
- PodSpec.Container.ResizePolicy (new object) gives users control over how their containers are resized.
- PodStatus.Resize status describes the state of a requested Pod resize.
- PodStatus.ResourcesAllocated describes node resources allocated to Pod.
- PodStatus.Resources describes node resources applied to running containers by CRI.
- UpdateContainerResources CRI API now supports both Linux and Windows.

For details, see KEPs below.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
- [KEP]: https://github.com/vinaykul/enhancements/tree/master/keps/sig-node/2273-kubelet-container-resources-cri-api-changes
- [Usage]: via kubectl or API 
e.g., kubectl patch pod bar --patch '{"spec":{"containers":[{"name":"ale", "resources":{"requests":{"memory":"500Mi"}, "limits":{"memory":"500Mi"}}}]}}'

derekwaynecarr
derekwaynecarr

I am trying to understand the motivation for this section. Can you explain what metric improved by updating the PLEG cache here, rather than waiting for the next relist interval? Is some information lost, that I missed, by not alerting the PLEG, making this invocation required? I ask because @mrunalp and a few others are looking at runtime-native watch support, and I would expect the runtime to be able to notify back any resource change that may have happened without this intersection point.

open pull request

derekwaynecarr wants to merge kubernetes/kubernetes

derekwaynecarr
derekwaynecarr

In-place Pod Vertical Scaling feature

What type of PR is this?

/kind feature /kind api-change

(failed reverse-i-search)`': cat /sys/fs/cgroup/cpu/kubepods/podd0dd7678-^Cf5-4b55-ad5d-08a384113ed4/cpu.cfs_quota_us 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"4650m"}, "limits":{"cpu":"4650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 4650m
        memory: 500Mi
      requests:
        cpu: 4650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
...
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
...
  qosClass: Guaranteed
  resize: Infeasible
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core#
derekwaynecarr

I am not seeing any changes to state in the present form of this function.

A follow-up for me is ensuring that status.Resources() actually includes all resources: not just cpu, memory, and ephemeral-storage, but also any extended resources that may have been present on the pod spec.

TODO for Derek: see if there is a test case here that includes extended resources.

open pull request

derekwaynecarr wants to merge kubernetes/kubernetes

derekwaynecarr

In-place Pod Vertical Scaling feature

What type of PR is this?

/kind feature /kind api-change

What this PR does / why we need it:

This PR brings the following changes, which largely implement the In-place Pod Vertical Scaling feature:

  1. API changes for the In-place Pod Vertical Scaling feature
  2. Implementation of CRI API changes to support In-place Pod Vertical Scaling (Kubelet CRI changes KEP)
  3. Core implementation that enables in-place vertical scaling for pods, comprehensively tested with the docker runtime
  4. Comprehensive E2E tests to validate the In-place Pod Vertical Scaling feature

Which issue(s) this PR fixes:

xref https://github.com/kubernetes/enhancements/issues/1287 xref https://github.com/kubernetes/enhancements/issues/2273

Special notes for your reviewer:

API changes: see https://github.com/kubernetes/kubernetes/pull/102884/commits/993e73c1e0aa632c3fbadac48287bb6b687f11ed
Scheduler changes: see https://github.com/kubernetes/kubernetes/pull/102884/commits/d1f43022ba3219ff18114d5c4770b1ff6370daaa
Kubelet CRI changes and implementation: see https://github.com/kubernetes/kubernetes/pull/102884/commits/f191ef26245cea36e4ef61ad7867e4da83998475 and https://github.com/kubernetes/kubernetes/pull/102884/commits/4bc0765f0857b77af98fdabca4d143f4d49ca1e9

Does this PR introduce a user-facing change? Yes

- PodSpec.Container.Resources becomes mutable for CPU and memory resource types.
- PodSpec.Container.ResizePolicy (new object) gives users control over how their containers are resized.
- PodStatus.Resize status describes the state of a requested Pod resize.
- PodStatus.ResourcesAllocated describes node resources allocated to Pod.
- PodStatus.Resources describes node resources applied to running containers by CRI.
- UpdateContainerResources CRI API now supports both Linux and Windows.

For details, see KEPs below.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
- [KEP]: https://github.com/vinaykul/enhancements/tree/master/keps/sig-node/2273-kubelet-container-resources-cri-api-changes
- [Usage]: via kubectl or the API,
e.g. kubectl patch pod bar --patch '{"spec":{"containers":[{"name":"ale", "resources":{"requests":{"memory":"500Mi"}, "limits":{"memory":"500Mi"}}}]}}'
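The quantities in these patches use Kubernetes resource-quantity notation, where "650m" means 650 millicores and a bare "4" means 4 whole cores. As a minimal sketch (a hypothetical helper, not part of this PR), the comparison of such quantities can be done on millicore integers:

```shell
# Hypothetical helper: normalize a Kubernetes CPU quantity to millicores.
# Handles only the two forms used in this demo: "<n>m" and a bare core count.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;            # "650m" -> 650
    *)  echo "$(( $1 * 1000 ))" ;;  # "4"    -> 4000
  esac
}

to_millicores 650m   # 650
to_millicores 4      # 4000
```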

Jun 26th: PodStatus.Resize has now been fully implemented. @thockin Please see below. I hope this serves as a simple signal to the API user (VPA) as to what's going on with a resize, so they may choose to take alternative action in the Deferred / Infeasible cases as allowed by their policy.
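The InProgress, Deferred, and Infeasible outcomes in the demo below can be reproduced with simple arithmetic against the node's allocatable CPU (4 cores) and the 100m already requested by coredns. This is only an illustrative sketch of the decision, not the kubelet's actual code:

```shell
# Illustrative sketch (not the kubelet's actual logic): classify a requested
# CPU resize against node allocatable CPU and what others already request.
allocatable_mcpu=4000      # node allocatable: cpu 4
other_requests_mcpu=100    # coredns requests 100m on this node
free_mcpu=$(( allocatable_mcpu - other_requests_mcpu ))

classify_resize() {
  local want_mcpu=$1
  if   [ "$want_mcpu" -gt "$allocatable_mcpu" ]; then echo "Infeasible"  # can never fit on this node
  elif [ "$want_mcpu" -gt "$free_mcpu" ];        then echo "Deferred"    # could fit later, not now
  else                                                echo "fits"        # resize proceeds (InProgress)
  fi
}

classify_resize 650    # fits
classify_resize 3950   # Deferred   (3950 > 4000 - 100)
classify_resize 4650   # Infeasible (4650 > 4000)
```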

[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh describe no 127.0.0.1
Name:               127.0.0.1
Roles:              <none>
...
...
Addresses:
  InternalIP:  127.0.0.1
  Hostname:    127.0.0.1
Capacity:
  cpu:                16
  ephemeral-storage:  927125032Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32928300Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  854438428077
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             3465772Ki
  pods:               110
System Info:
...
Non-terminated Pods:          (1 in total)
  Namespace                   Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                        ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-66cf7947cf-zvlxf    100m (2%)     0 (0%)      70Mi (2%)        170Mi (5%)     11m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (2%)  0 (0%)
  memory             70Mi (2%)  170Mi (5%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:
...
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# cat ~/YML/2pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
spec:
  containers:
  - name: stress
    image: skiibum/ubuntu-stress:18.10
    resources:
      limits:
        cpu: "500m"
        memory: "500Mi"
      requests:
        cpu: "500m"
        memory: "500Mi"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh create -f ~/YML/2pod.yaml 
pod/2pod created
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 500m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"650m"}, "limits":{"cpu":"650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 650m
        memory: 500Mi
      requests:
        cpu: 650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: InProgress
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"3950m"}, "limits":{"cpu":"3950m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 3950m
        memory: 500Mi
      requests:
        cpu: 3950m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
    started: true
...
  qosClass: Guaranteed
  resize: Deferred
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
(failed reverse-i-search)`': cat /sys/fs/cgroup/cpu/kubepods/podd0dd7678-^Cf5-4b55-ad5d-08a384113ed4/cpu.cfs_quota_us 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh patch pod 2pod --patch '{"spec":{"containers":[{"name":"stress", "resources":{"requests":{"cpu":"4650m"}, "limits":{"cpu":"4650m"}}}]}}'
pod/2pod patched
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# ./cluster/kubectl.sh get po 2pod -oyaml 
apiVersion: v1
kind: Pod
metadata:
  name: 2pod
  namespace: default
spec:
  containers:
  - image: skiibum/ubuntu-stress:18.10
    name: stress
    resizePolicy:
    - policy: RestartNotRequired
      resourceName: cpu
    - policy: RestartNotRequired
      resourceName: memory
    resources:
      limits:
        cpu: 4650m
        memory: 500Mi
      requests:
        cpu: 4650m
        memory: 500Mi
...
...
status:
  conditions:
...
  containerStatuses:
  - containerID: docker://015b2d8605c732329129a8d61894ef5438b5a8ed09da0b5e56dad82d3b57a789
    image: skiibum/ubuntu-stress:18.10
...
    name: stress
    ready: true
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    resourcesAllocated:
      cpu: 650m
      memory: 500Mi
    restartCount: 0
...
  qosClass: Guaranteed
  resize: Infeasible
  startTime: "2021-06-27T02:06:56Z"
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core# 
[email protected]:~/go/src/k8s.io/kubernetes-rfpvs-core#
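The transcript above includes an interrupted attempt to inspect cpu.cfs_quota_us for the pod's cgroup. For reference, with the default 100ms CFS period (assuming cgroup v1 and an unchanged cpu.cfs_period_us), a CPU limit in millicores maps to the written quota as follows:

```shell
# Sketch: CPU limit in millicores -> CFS quota in microseconds,
# assuming the default cfs_period_us of 100000 (100ms).
period_us=100000
millicores_to_cfs_quota() {
  echo "$(( $1 * period_us / 1000 ))"
}

millicores_to_cfs_quota 500   # 50000
millicores_to_cfs_quota 650   # 65000
```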
derekwaynecarr

future idea: it would be good to have a helper function, say `GetResizableContainersForPod`, that localizes the logic for primary containers and ignores init containers or similar, so it is easier for future reviewers to recall.

open pull request

derekwaynecarr wants to merge kubernetes/kubernetes

derekwaynecarr

In-place Pod Vertical Scaling feature

derekwaynecarr

note to Derek for follow-up: in HandlePodUpdate and HandlePodReconcile, prior to this invocation, we are also calling UpdatePod using the source from the config source, independent of what happens in the syncPod loops. I guess it's possible this could flip ResizeStatus until the kubelet actually writes the pod status back to kube-api.
