Posted 7 months ago

If the pods of a new canary revision are stuck in an error state (e.g. ImagePullBackOff), the rollout becomes Degraded once the progress deadline is exceeded, but the bad ReplicaSet and its pods are not scaled down.

Steps to reproduce:

  1. Create a rollout with canary strategy, and have a stable version running.
  2. Create a new revision of this rollout with a bad image tag (this would be the canary version).
  3. Wait for the rollout to progress.
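
The steps above could be reproduced with a manifest along these lines. This is a minimal sketch, not the original setup: the name `canary-repro`, the `nginx` image, and the step values are placeholders, and it leaves out the Istio traffic routing and analysis from the original rollout, so behavior may differ in detail.

```yaml
# Hypothetical minimal Rollout for reproduction; all names and images are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: canary-repro
spec:
  replicas: 4
  progressDeadlineSeconds: 600   # documented default; lower it to hit the timeout sooner
  selector:
    matchLabels:
      app: canary-repro
  template:
    metadata:
      labels:
        app: canary-repro
    spec:
      containers:
      - name: app
        # Step 1: apply with a tag that exists so a stable version is running.
        # Step 2: change the tag to one that does not exist to force ImagePullBackOff.
        image: nginx:1.25
  strategy:
    canary:
      steps:
      - setWeight: 50
      - pause:
          duration: 60s
```

After the second apply, the canary pods should stay in ImagePullBackOff until the progress deadline passes.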

The pods keep retrying, and eventually the rollout status becomes Degraded with the message:

ProgressDeadlineExceeded: ReplicaSet "<name>" has timed out progressing.

However, the canary ReplicaSet still shows up as Progressing.

The expectation is that the canary ReplicaSet is scaled down as well once the rollout becomes Degraded.

Relevant part of the rollout spec (canary strategy):

      scaleDownDelaySeconds: 180
      steps:
      - setCanaryScale:
          weight: 100
      - setWeight: 100
      - pause:
          duration: 60s
      - analysis:
          templates:
          - templateName: istio-shipping-bg-fit
      trafficRouting:
        istio:
          destinationRule:
            canarySubsetName: preview
            name: istio-shipping-bg
            stableSubsetName: stable
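
The excerpt above does not show the top-level `progressDeadlineSeconds` or `progressDeadlineAbort` fields. For context, here is a sketch of where they sit in a Rollout spec, assuming a controller version that supports `progressDeadlineAbort`. The values are illustrative, and the post does not confirm whether enabling the abort changes the behavior reported here.

```yaml
# Sketch only: field placement follows the Rollout spec; values are illustrative.
spec:
  progressDeadlineSeconds: 600    # time without progress before ProgressDeadlineExceeded
  progressDeadlineAbort: true     # abort the update when the deadline is exceeded (default: false)
  strategy:
    canary:
      # With traffic routing: delay before the canary is scaled down after an abort.
      # A value of 0 means the canary pods are not scaled down.
      abortScaleDownDelaySeconds: 30
```

With `progressDeadlineAbort` left at its default of `false`, exceeding the deadline only marks the rollout Degraded and does not abort it, which matches the status shown in this report.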

Output of kubectl argo rollouts get rollouts istio-shipping-bg:

Name:            istio-shipping-bg
Namespace:       service-mesh
Status:          ✖ Degraded
Message:         ProgressDeadlineExceeded: ReplicaSet "istio-shipping-bg-587fd74d5b" has timed out progressing.
Strategy:        Canary
  Step:          0/4
  SetWeight:     0
  ActualWeight:  0
Images:         <goodimagename> (stable)
                <badimagename> (canary)
  Desired:       4
  Current:       8
  Updated:       4
  Ready:         4
  Available:     4

NAME                                                                       KIND         STATUS              AGE    INFO
⟳ istio-shipping-bg                                                        Rollout      ✖ Degraded          40d    
├──# revision:33                                                                                                   
│  └──⧉ istio-shipping-bg-587fd74d5b                                       ReplicaSet   ◌ Progressing       6d2h   canary
│     ├──□ istio-shipping-bg-587fd74d5b-vc7nl                              Pod          ⚠ ImagePullBackOff  6d2h   ready:3/4
│     ├──□ istio-shipping-bg-587fd74d5b-g8pw7                              Pod          ⚠ ImagePullBackOff  5d2h   ready:3/4
│     ├──□ istio-shipping-bg-587fd74d5b-qzgf4                              Pod          ⚠ ImagePullBackOff  4d19h  ready:3/4
│     └──□ istio-shipping-bg-587fd74d5b-xst4n                              Pod          ⚠ ImagePullBackOff  3d3h   ready:3/4
├──# revision:32                                                                                                   
│  ├──⧉ istio-shipping-bg-84f758b689                                       ReplicaSet   ✔ Healthy           6d3h   stable
│  │  ├──□ istio-shipping-bg-84f758b689-pdgxb                              Pod          ✔ Running           6d2h   ready:4/4
│  │  ├──□ istio-shipping-bg-84f758b689-dwg97                              Pod          ✔ Running           5d2h   ready:4/4
│  │  ├──□ istio-shipping-bg-84f758b689-tdmhr                              Pod          ✔ Running           4d19h  ready:4/4
│  │  └──□ istio-shipping-bg-84f758b689-7cl97                              Pod          ✔ Running           3d3h   ready:4/4
│  ├──α istio-shipping-bg-84f758b689-32-3                                  AnalysisRun  ✖ Failed            6d3h   ✖ 1
│  │  └──⊞ ecf3a240-842a-4c19-9b33-3349c0d34d80.istio-shipping-bg-sleep.1  Job          ✖ Failed            6d3h   
│  └──α istio-shipping-bg-84f758b689-32-3.1                                AnalysisRun  ✔ Successful        6d2h   ✔ 1
│     └──⊞ 0a3e6115-9527-4e4f-a7a9-98c02ede767b.istio-shipping-bg-fit.1    Job          ✔ Successful        6d2h   
├──# revision:31                                                                                                   
│  ├──⧉ istio-shipping-bg-749f699c88                                       ReplicaSet   • ScaledDown        6d3h   delay:passed
│  └──α istio-shipping-bg-749f699c88-31-3                                  AnalysisRun  ⚠ Error             6d3h   ⚠ 5

