smarterclayton

smarterclayton

Kubernetes maintainer and general repair bot

Member Since 10 years ago

North Carolina

1.2k followers
0 following
282 stars
195 repos

233 contributions in the last year

Pinned
⚡ Container Cluster Manager
⚡ OpenShift 3 - build, deploy, and manage your applications with Docker and Kubernetes
⚡ OpenShift Ansible Code
⚡ vagrant-openshift
Activity
May 21 (2 days ago)
issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

dockershim takes 1h30m to successfully kill a pod in node-serial tests

In https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial/1420578748725465088, "E2eNode Suite: [sig-node] Restart [Serial] [Slow] [Disruptive] [NodeFeature:ContainerRuntimeRestart] Container Runtime Network should recover from ip leak" is failing because it takes 1h30m to terminate the pod test-e7842f3f-c74c-4285-9b78-3d26c9d53bac. Looking through the logs, we attempt to call syncTerminatingPod 86 times and each attempt fails. Grepping for the error returned:

 E0729 03:54:43.316147   48428 pod_workers.go:747] "Error syncing pod, skipping" err="detected running containers after a successful KillPod, CRI violation: [docker://57b03a8780ccc5c862e96d16c5ebc39fecffb110b10d00f72d390d3931659c32]" pod="restart-test-3493/test-e7842f3f-c74c-4285-9b78-3d26c9d53bac" podUID=512e8f6d-1557-44f5-b3b1-9ff6745c4992
...
 E0729 05:19:52.464196  117608 pod_workers.go:747] "Error syncing pod, skipping" err="detected running containers after a successful KillPod, CRI violation: [docker://57b03a8780ccc5c862e96d16c5ebc39fecffb110b10d00f72d390d3931659c32]" pod="restart-test-3493/test-e7842f3f-c74c-4285-9b78-3d26c9d53bac" podUID=512e8f6d-1557-44f5-b3b1-9ff6745c4992
 E0729 05:20:14.770915  117608 pod_workers.go:747] "Error syncing pod, skipping" err="[failed to \"KillContainer\" for \"test-e7842f3f-c74c-4285-9b78-3d26c9d53bac\" with KillContainerError: \"rpc error: code = Unknown desc = error during connect: Post \\\"http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/57b03a8780ccc5c862e96d16c5ebc39fecffb110b10d00f72d390d3931659c32/stop?t=30\\\": EOF\", failed to \"KillPodSandbox\" for \"512e8f6d-1557-44f5-b3b1-9ff6745c4992\" with KillPodSandboxError: \"rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\"]" pod="restart-test-3493/test-e7842f3f-c74c-4285-9b78-3d26c9d53bac" podUID=512e8f6d-1557-44f5-b3b1-9ff6745c4992

which is because of

 I0729 05:19:52.464131  117608 kubelet.go:1813] "Post-termination container state" pod="restart-test-3493/test-e7842f3f-c74c-4285-9b78-3d26c9d53bac" podUID=512e8f6d-1557-44f5-b3b1-9ff6745c4992 containers="(test-e7842f3f-c74c-4285-9b78-3d26c9d53bac state=running exitCode=0 finishedAt=0001-01-01T00:00:00Z)"
 I0729 05:20:15.958943  117608 kubelet.go:1813] "Post-termination container state" pod="restart-test-3493/test-e7842f3f-c74c-4285-9b78-3d26c9d53bac" podUID=512e8f6d-1557-44f5-b3b1-9ff6745c4992 containers="(test-e7842f3f-c74c-4285-9b78-3d26c9d53bac state=exited exitCode=0 finishedAt=2021-07-29T03:10:02.668575184Z)"

which means that KillPod in dockershim reported success even though it was unable to kill a container.
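
For illustration, a minimal sketch of the kind of post-KillPod consistency check that produces the "CRI violation" error above (hypothetical types; not the actual kubelet code):

    package main

    import "fmt"

    // containerStatus is a stand-in for the runtime-reported container state.
    type containerStatus struct {
        ID    string // e.g. "docker://<container id>"
        State string // "running", "exited", ...
    }

    // verifyPodKilled re-checks container state after the runtime reports a
    // successful KillPod and fails the sync if anything is still running.
    func verifyPodKilled(statuses []containerStatus) error {
        var running []string
        for _, s := range statuses {
            if s.State == "running" {
                running = append(running, s.ID)
            }
        }
        if len(running) > 0 {
            return fmt.Errorf("detected running containers after a successful KillPod, CRI violation: %v", running)
        }
        return nil
    }

    func main() {
        fmt.Println(verifyPodKilled([]containerStatus{{ID: "docker://example", State: "running"}}))
    }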

This would probably be a release blocker, but I don't know that it's a regression.

/kind bug /sig node

smarterclayton
smarterclayton
May 12 (1 week ago)
issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

Pods with failed status IP address reused on new pods, but traffic still going to old pods across namespaces.

What happened?

Since upgrading to v1.22.7-gke.1500 in our GKE cluster, we have had customer reports that traffic from one namespace is going to another namespace.

After investigating these reports, we found containers with these statuses: OutOfmemory, Terminated, ContainerStatusUnknown, OOMKilled, OutOfcpu. Their IP address is given to a new pod in a different namespace, yet the pods with the statuses above are still running and serving requests. The only way to resolve it is to manually delete the container, which immediately resolves the issue.

We are not sure where to start looking. We assume that these containers are still running on the nodes, although we haven't seen them when running this command on the node: crictl ps

However, they must be running, because when we have this issue the web application on that namespace loads fully.

What did you expect to happen?

For the pods to be deleted, and if their IP address is reused, for traffic to that IP address not to be routed to the old pod anymore.

How can we reproduce it (as minimally and precisely as possible)?

Start a GKE cluster using OS: Container-Optimized OS, on version: v1.22.7-gke.1500

Achieve one of the following statuses on a pod: OutOfmemory, Terminated, ContainerStatusUnknown, OOMKilled, OutOfcpu.

Observe that a new pod in the cluster is assigned the same IP address as the pod with one of the above failed statuses, and that the old pod is still serving requests on the now-reused IP address.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
v1.22.7-gke.1500

Cloud provider

GKE

OS version

# On Linux:
$ cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
KERNEL_COMMIT_ID=ccbab0481cec29d7f07947bcb6255f325b88513f
GOOGLE_CRASH_ID=Lakitu
GOOGLE_METRICS_PRODUCT_ID=26
VERSION=93
VERSION_ID=93
BUILD_ID=16623.102.23
$ uname -a
Linux gke-cf-europe-west2-cluster-ego-c2-pm-08b6d578-8kz5 5.10.90+ #1 SMP Sat Mar 5 10:09:49 UTC 2022 x86_64 Intel(R) Xeon(R) CPU @ 3.10GHz GenuineIntel GNU/Linux

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

smarterclayton
smarterclayton

Are CNI plug-ins depending on pod status IPs as the authoritative record of allocation? Also, do we formally define CNI destroy as happening before that release?

The answer to those two questions is required to correctly determine where in pod shutdown the logic needs to be added (and this is another kubelet e2e test we need to add). I.e., if the answer to the second is no, we can clear the status podIPs once the pod containers are confirmed shut down. If the answer to the second is yes, we have to defer the final pod status update and the clearing until after CNI destroy is guaranteed to succeed (which has other safety implications).
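
For illustration, a minimal sketch of the first option (clearing the reported IPs once every container is confirmed shut down), written against the core v1 types; this is an assumption about the shape of the change, not the actual kubelet code:

    package main

    import (
        "fmt"

        v1 "k8s.io/api/core/v1"
    )

    // clearPodIPsIfTerminated clears the IPs a pod reports in status once all
    // of its containers are confirmed terminated, so nothing treats the address
    // as still owned by this pod.
    func clearPodIPsIfTerminated(status *v1.PodStatus) {
        for _, cs := range status.ContainerStatuses {
            if cs.State.Terminated == nil {
                return // at least one container is not confirmed shut down yet
            }
        }
        status.PodIP = ""
        status.PodIPs = nil
    }

    func main() {
        s := &v1.PodStatus{PodIP: "10.0.0.5", PodIPs: []v1.PodIP{{IP: "10.0.0.5"}}}
        clearPodIPsIfTerminated(s) // no container statuses -> nothing left running
        fmt.Printf("%q %v\n", s.PodIP, s.PodIPs)
    }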

May 5 (2 weeks ago)
issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

Promote Batchv1JobLifecycleTest +4 Endpoints

What type of PR is this? /kind cleanup

What this PR does / why we need it: This PR adds a test covering the following untested endpoints:

  • patchBatchV1NamespacedJob
  • replaceBatchV1NamespacedJob
  • deleteBatchV1CollectionNamespacedJob
  • listBatchV1JobForAllNamespaces

Which issue(s) this PR fixes: Fixes #108641

Testgrid Link: Batchv1JobLifecycleTest Testgrid

Special notes for your reviewer: Adds +4 endpoint test coverage (good for conformance)

Does this PR introduce a user-facing change?:

NONE

Release note:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

/sig testing /sig architecture /area conformance

Apr 6 (1 month ago)
open pull request

smarterclayton wants to merge kubernetes/enhancements

smarterclayton
smarterclayton

At this point we will have incrementally implemented 90% of etcd, vs 50%.

So the first meta question I want to discuss is whether the final shape is our idealized version of storage (a reference in-memory version), and if so whether our approach is building towards something we should formally define. Second is whether the etcd proxy / core etcd data structures do a better job of this than the approach we are taking. Third is identifying the core data structure tradeoff: is the btree the ideal structure for us vs other types (why you started here), and which key semantics are "must support", so that we can decide whether the data structure is as minimal as possible.

Apr 5 (1 month ago)
issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

Promote Read, Replace, Patch BatchV1NamespacedJobStatus test - +3 endpoints

What type of PR is this? /kind cleanup

What this PR does / why we need it: This PR adds a test covering the following untested endpoints:

  • replaceBatchV1NamespacedJobStatus
  • readBatchV1NamespacedJobStatus
  • patchBatchV1NamespacedJobStatus

Which issue(s) this PR fixes: Fixes #108113

Testgrid Link: testgrid-link

Special notes for your reviewer: Adds +3 endpoint test coverage (good for conformance)

Does this PR introduce a user-facing change?:

NONE

Release note:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

/sig testing /sig architecture /area conformance

Apr 4 (1 month ago)
issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

client-go: make retry in Request thread safe

What type of PR is this?

/kind bug

What this PR does / why we need it:

We previously guaranteed the thread safety of methods called on Request in client-go; the retry interface introduced is a member variable and is not thread safe. This PR introduces a factory function that returns a retry interface inside Watch, Do, DoRaw, and Stream, making Request thread safe as it was before.

Please note there are other member variables in Request that are not thread safe today; this PR does not address those.
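
For illustration, a self-contained sketch of that pattern (hypothetical names; not the actual client-go code):

    package restsketch

    import "context"

    // Rather than storing one shared retry object on the Request, keep a factory
    // and build fresh retry state inside each call, so concurrent Do/Watch/Stream
    // calls on the same Request do not share mutable retry bookkeeping.

    type withRetry interface {
        isNextRetry(err error) bool
    }

    type countingRetry struct{ attempts, max int }

    func (c *countingRetry) isNextRetry(err error) bool {
        c.attempts++
        return err != nil && c.attempts <= c.max
    }

    // requestRetryFunc is the factory type; the naming follows the review note
    // further down (Fn for variables, Func for types).
    type requestRetryFunc func(maxRetries int) withRetry

    type request struct {
        maxRetries int
        retryFn    requestRetryFunc // overridable for testing
    }

    func (r *request) do(ctx context.Context, send func() error) error {
        retry := r.retryFn(r.maxRetries) // fresh, call-local retry state
        for {
            err := send()
            if err == nil || !retry.isNextRetry(err) {
                return err
            }
        }
    }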

Which issue(s) this PR fixes:

Fixes #109155

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


smarterclayton
smarterclayton

/lgtm /approve

After 1.24 I'd like to update the godoc of client-go to make thread safety obvious, and add a test that verifies client behavior in the presence of retries (such that the race detector would flag us).
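
A sketch of the kind of test described here, using a local stand-in rather than a real *rest.Request (illustrative only; with shared retry state, `go test -race` would flag the concurrent calls):

    package restsketch

    import (
        "sync"
        "testing"
    )

    type stubRetry struct{ attempts, max int }

    func (c *stubRetry) next() bool { c.attempts++; return c.attempts <= c.max }

    type stubRequest struct {
        newRetry func() *stubRetry // per-call retry factory, as in the fix
    }

    func (r *stubRequest) do() {
        retry := r.newRetry() // call-local state; a shared field here would race
        for retry.next() {
        }
    }

    func TestConcurrentDoIsRaceFree(t *testing.T) {
        req := &stubRequest{newRetry: func() *stubRetry { return &stubRetry{max: 3} }}
        var wg sync.WaitGroup
        for i := 0; i < 10; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                req.do()
            }()
        }
        wg.Wait()
    }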

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

client-go: make retry in Request thread safe

smarterclayton
smarterclayton

I'd prefer not to use "factory" (it's not part of our style).

Generally, this would be:

retryFn requestRetryFunc

(we use Fn for variables and Func for types.)

Also, the godoc should describe this as being for testing.

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

client-go: make retry in Request thread safe

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

client-go: make retry in Request thread safe

smarterclayton
smarterclayton

I'm going to try to get to this tonight, but other stuff has been blocking me.

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

Promote Batchv1JobLifecycleTest & BatchV1NamespacedJobStatus test +7 endpoint coverage

What type of PR is this? /kind cleanup

What this PR does / why we need it: This PR adds a test covering the following untested endpoints:

  • patchBatchV1NamespacedJob
  • replaceBatchV1NamespacedJob
  • deleteBatchV1CollectionNamespacedJob
  • listBatchV1JobForAllNamespaces
  • replaceBatchV1NamespacedJobStatus
  • readBatchV1NamespacedJobStatus
  • patchBatchV1NamespacedJobStatus

Which issue(s) this PR fixes: Fixes #108113 Fixes #108641

Testgrid Link: Batchv1JobLifecycleTest Testgrid

Testgrid Link: BatchV1NamespacedJobStatus testgrid-link

Special notes for your reviewer: Adds +7 endpoint test coverage (good for conformance)

Does this PR introduce a user-facing change?:

NONE

Release note:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

/sig testing /sig architecture /area conformance

smarterclayton
smarterclayton
		err = retry.RetryOnConflict(retry.DefaultRetry, func() error {
			patchedJob, err = jobClient.Get(context.TODO(), jobName, metav1.GetOptions{})
			framework.ExpectNoError(err, "Unable to get job %s", jobName)
			patchedJob.Spec.Suspend = pointer.BoolPtr(false)
			patchedJob.Annotations["updated"] = "true"

                        ^ you can't assume annotations is set, you need to initialize it to an empty map if it's nil first

			updatedJob, err = e2ejob.UpdateJob(f.ClientSet, ns, patchedJob)
			return err
		})

So that's a bug in the test; fixing it would fix the flake.
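
For concreteness, a sketch of the suggested fix (illustrative only, reusing the helpers already in the quoted snippet):

		err = retry.RetryOnConflict(retry.DefaultRetry, func() error {
			patchedJob, err = jobClient.Get(context.TODO(), jobName, metav1.GetOptions{})
			framework.ExpectNoError(err, "Unable to get job %s", jobName)
			patchedJob.Spec.Suspend = pointer.BoolPtr(false)
			if patchedJob.Annotations == nil {
				// Annotations can be nil on the fetched object; writing into a
				// nil map panics, so initialize it first.
				patchedJob.Annotations = map[string]string{}
			}
			patchedJob.Annotations["updated"] = "true"
			updatedJob, err = e2ejob.UpdateJob(f.ClientSet, ns, patchedJob)
			return err
		})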

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

CSI spec 1.5 added an optional secrets field to NodeExpandVolumeRequest. This commit adds NodeExpandSecret to the CSI PV source and also derives the expansion secret in csiclient to send it out as part of the NodeExpandVolume request.

What type of PR is this?

/kind feature

Optionally add one or more of the following kinds if applicable: /kind api-change

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #95367

Special notes for your reviewer:

Does this PR introduce a user-facing change?

This release adds support for NodeExpandSecret in the CSI driver client, which enables CSI drivers to make use of this secret while performing a node expansion operation based on the user request. Previously no secret was provided as part of the NodeExpandVolume call, so CSI drivers could not make use of one while expanding the volume on the node side.

KEP reference: https://github.com/kubernetes/enhancements/pull/3173/

smarterclayton
smarterclayton

/approve

for the API changes and the feature gate

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

rest: Ensure response body is fully read and closed before retry

This commit refactors the retry logic to include resetting the request body. The reset logic is called iff it is not the first attempt. This refactor is necessary mainly because, per the retry logic, we now always ensure that the request body is reset after the response body is fully read and closed, in order to reuse the same TCP connection.

Previously, the reset of the request body and the call to read and close the response body were not in the right order, which led to race conditions.

xref https://github.com/kubernetes/kubernetes/issues/108906

Fixes a bug in our client retry logic that did not drain/close the response body prior to trying to reset the request body for a retry. According to Go upstream, this is required to avoid races.
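
For illustration, a sketch of the ordering being enforced (not the actual client-go code; the helper name and drain limit are assumptions):

    package restsketch

    import (
        "fmt"
        "io"
        "net/http"
    )

    // prepareForRetry drains and closes the previous response body so the
    // HTTP/1.x connection can be reused, and only then rewinds the request body
    // for the next attempt.
    func prepareForRetry(resp *http.Response, reqBody io.ReadSeeker) error {
        if resp != nil && resp.Body != nil {
            // Drain a bounded amount and close; this lets net/http keep the
            // underlying TCP connection alive for reuse.
            io.Copy(io.Discard, io.LimitReader(resp.Body, 4096))
            resp.Body.Close()
        }
        if reqBody != nil {
            // Reset the request body only after the response body is done with.
            if _, err := reqBody.Seek(0, io.SeekStart); err != nil {
                return fmt.Errorf("failed to reset the request body: %w", err)
            }
        }
        return nil
    }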

Does this PR introduce a user-facing change?

client-go: if resetting the body fails before a retry, an error is now surfaced to the user.

Continuing https://github.com/kubernetes/kubernetes/pull/109028

/priority critical-urgent /assign @aojea @liggitt @smarterclayton /cc @dims @tkashem would something like this work?

smarterclayton
smarterclayton

Would be good to see that test added. This is lgtm from my perspective, but I'll let one other person do the tag when the test is added (@tkashem, probably).

/approve

Apr 1 (1 month ago)
open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

kubelet: Add GracefulNodeShutdownPodPolicy config option to graceful node shutdown

What type of PR is this?

/kind feature

What this PR does / why we need it:

Graceful node shutdown currently places pods into terminal phase upon shutdown. As discussed in https://github.com/kubernetes/kubernetes/issues/104531#issuecomment-982763592 and https://github.com/kubernetes/kubernetes/issues/104531#issuecomment-982766908 some users/distributions would benefit from the ability to toggle this behavior and instead to not put pods into terminal phase on shutdown.

For example, if it's expected that after shutdown the node will reboot, it may be desirable to not place the pods into terminal phase, so that way after reboot the pods will start running again after the node comes back up.

This PR adds a new kubelet configuration option, GracefulNodeShutdownPodPolicy, to toggle this behavior. The default setting of GracefulNodeShutdownPodPolicy is SetTerminal, which matches the current behavior since 1.20, when graceful node shutdown was introduced.

Which issue(s) this PR fixes:

Fixes #108991

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Add a new kubelet configuration option `GracefulNodeShutdownPodPolicy` to toggle setting pods to terminal phase during graceful node shutdown.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


smarterclayton
smarterclayton

Only default GracefulNodeShutdownPodPolicy to SetTerminal if the Graceful Node Shutdown feature is enabled (i.e. ShutdownGracePeriod is set) in pkg/kubelet/apis/config/v1beta1/defaults.go. Otherwise the setting will be left unset ("").

Let me translate this to make sure we're all on the same page with possible configurations

  1. User specifies ShutdownGracePeriod but GracefulNodeShutdownPodPolicy is empty or null -> defaults to SetTerminal (preserves behavior for beta users)
  2. User specifies GracefulNodeShutdownPodPolicy but ShutdownGracePeriod is empty or zero -> default to LeaveRunning
  3. User specifies GracefulNodeShutdownPodPolicy and ShutdownGracePeriod -> no default necessary

If we did this, then I think users who move from 2 to 3 would be confused (they'd switch from LeaveRunning to SetTerminal, which is bad).

So I might argue that 2 should be:

  2. User specifies GracefulNodeShutdownPodPolicy but ShutdownGracePeriod is empty or zero -> validation error; the user must explicitly specify ShutdownGracePeriod

Which then trains new users to make a choice (and we can say LeaveRunning is the default behavior for pods on kubelet shutdown when grace period is disabled or something).
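
For illustration, a self-contained sketch of the defaulting plus the proposed validation split above (hypothetical types and constants, not the merged kubelet code):

    package main

    import (
        "fmt"
        "time"
    )

    // shutdownConfig is a stand-in for the relevant kubelet config fields.
    type shutdownConfig struct {
        ShutdownGracePeriod           time.Duration
        GracefulNodeShutdownPodPolicy string // "", "SetTerminal", or "LeaveRunning"
    }

    // applyDefaults covers case 1: grace period set, policy unset -> default to
    // SetTerminal, preserving the behavior existing beta users rely on.
    func applyDefaults(c *shutdownConfig) {
        if c.ShutdownGracePeriod > 0 && c.GracefulNodeShutdownPodPolicy == "" {
            c.GracefulNodeShutdownPodPolicy = "SetTerminal"
        }
    }

    // validate covers the proposed case 2: policy set without a grace period is
    // rejected, forcing the user to make an explicit choice.
    func validate(c *shutdownConfig) error {
        if c.GracefulNodeShutdownPodPolicy != "" && c.ShutdownGracePeriod == 0 {
            return fmt.Errorf("gracefulNodeShutdownPodPolicy requires shutdownGracePeriod to be set")
        }
        return nil
    }

    func main() {
        cfg := &shutdownConfig{ShutdownGracePeriod: 30 * time.Second}
        applyDefaults(cfg)
        fmt.Println(cfg.GracefulNodeShutdownPodPolicy, validate(cfg)) // SetTerminal <nil>
    }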

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

kubelet: Add GracefulNodeShutdownPodPolicy config option to graceful node shutdown

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

smarterclayton
smarterclayton

Sorry for this - but we wouldn't use "nil" because this apidoc is intended for JSON users (I should have clarified). Now that I think you've answered the question as "the field is absent / omitted", I'd suggest this sentence (and all the others in this file, which can be done later) read:

This field is optional, may be omitted if no secret is required.
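
For illustration, roughly how that phrasing would read on the field's godoc (a sketch with abbreviated types, not the exact upstream definitions):

    package apisketch

    // SecretReference is abbreviated here for the sketch.
    type SecretReference struct {
        Name      string `json:"name,omitempty"`
        Namespace string `json:"namespace,omitempty"`
    }

    type CSIPersistentVolumeSource struct {
        // nodeExpandSecretRef is a reference to the secret object containing
        // sensitive information to pass to the CSI driver to complete the CSI
        // NodeExpandVolume call.
        // This field is optional, and may be omitted if no secret is required.
        // +optional
        NodeExpandSecretRef *SecretReference `json:"nodeExpandSecretRef,omitempty"`
    }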

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

rest: Ensure response body is fully read and closed before retry

smarterclayton
smarterclayton

Do we have enough testing to be confident the race is now closed with the current changes? If so, I'll tag this now, otherwise I'll wait for that.

Mar 31 (1 month ago)
pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

rest: Ensure response body is fully read and closed before retry

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

rest: Ensure response body is fully read and closed before retry

smarterclayton
smarterclayton

I think we may need to assert that this is !apierrors.IsInternalError(err), or assert this is not a server error at all (possibly by checking what the type is after unwrapping).
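
A rough sketch of that assertion (illustrative; assumes err is the error surfaced by the retried request):

    package restsketch

    import (
        "errors"
        "testing"

        apierrors "k8s.io/apimachinery/pkg/api/errors"
    )

    // assertNotServerError fails the test if the surfaced error is actually a
    // server-side API error rather than a client-side body-reset failure.
    func assertNotServerError(t *testing.T, err error) {
        t.Helper()
        if apierrors.IsInternalError(err) {
            t.Fatalf("expected a client-side body-reset error, got a server internal error: %v", err)
        }
        var statusErr *apierrors.StatusError
        if errors.As(err, &statusErr) {
            t.Fatalf("expected a non-API error, got: %v", statusErr)
        }
    }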

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

pkg/storage/etcd3: correctly validate resourceVersions

In a number of tests, the underlying storage backend interaction will return the revision (logical clock underpinning the MVCC implementation) at the call-time of the RPC. Previously, the tests validated that this returned revision was exactly equal to some previously seen revision. This assertion is only true in systems where no other events are advancing the logical clock. For instance, when using a single etcd cluster as a shared fixture for these tests, the assertion is not valid any longer. By checking that the returned revision is no older than the previously seen revision, the validation logic is correct in all cases.

Signed-off-by: Steve Kuznetsov [email protected]


Depends on https://github.com/kubernetes/kubernetes/pull/108936

/kind cleanup

NONE

/sig api-machinery /assign @liggitt @smarterclayton @sttts @deads2k

smarterclayton
smarterclayton

I could see three possibilities:

  1. etcd adds a background write that can happen at any time -> we would need > semantics in test
  2. we change etcd storage to do GETS in some cases -> we would change the test to capture that two operations happened (so + 2)
  3. kine wanted to leverage these tests, and the kine impl doesn't really need to guarantee incremental RVs -> we generalize these tests, and the minimum verification is > but we might want the generalized tests to be parameterized to describe the exact relationship (or we simply duplicate the tests via good old cut and paste)

I assume here, Steve, that you'd like to generalize these tests, but I wanted to be sure?
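
For reference, a minimal sketch of the relaxed "no older than" check under discussion (illustrative, not the actual test helper):

    package etcd3sketch

    import "testing"

    // expectRevisionNotOlder asserts the revision returned by a storage call is
    // no older than one previously observed, rather than exactly equal to it.
    func expectRevisionNotOlder(t *testing.T, previous, returned int64) {
        t.Helper()
        if returned < previous {
            t.Fatalf("returned revision %d is older than previously observed revision %d", returned, previous)
        }
    }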

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

pkg/storage/etcd3: correctly validate resourceVersions

issue

smarterclayton issue comment kubernetes/kubernetes

smarterclayton
smarterclayton

pkg/storage/etcd3: correctly validate resourceVersions

smarterclayton
smarterclayton

Would kine be more testable if this were fixed? Is there any real disadvantage to weakening this assumption (such as compaction or future etcd changes that lead to potential background changes)?

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

smarterclayton
smarterclayton

To clarify - "empty" in the context of a client in a JSON API would generally mean "nodeExpandSecretRef": {}, which is definitely not allowed by validation.
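
For illustration, the payload shapes in question (the driver name and secret values are hypothetical):

    package csisketch

    const (
        // Field absent / omitted entirely - the case the apidoc should describe.
        omitted = `{"csi": {"driver": "example.csi.driver"}}`
        // Secret provided.
        populated = `{"csi": {"driver": "example.csi.driver", "nodeExpandSecretRef": {"name": "expand-secret", "namespace": "default"}}}`
        // "Empty" in the JSON sense: present but {} - rejected by validation.
        emptyObject = `{"csi": {"driver": "example.csi.driver", "nodeExpandSecretRef": {}}}`
    )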

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

smarterclayton
smarterclayton

(I realize we're just copying the other fields, but it doesn't make sense to me as an API user what you mean by "empty", so I'm trying to understand what it means so we can decide on a follow-up.)

pull request

smarterclayton merge to kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

open pull request

smarterclayton wants to merge kubernetes/kubernetes

smarterclayton
smarterclayton

csi: add nodeExpandSecret support for CSI client & add unit test

smarterclayton
smarterclayton

"may be empty" reads weird. What do you mean by it?
