Pages

Nov 1, 2018

Short-term fix for the watch stream error

A couple of weeks ago, some Azure Kubernetes Service (AKS) users started seeing watch stream errors in their clusters.

Symptoms

Pods using the in-cluster configuration to perform a watch on a resource see intermittent timeouts and the following error in the pod log.

streamwatcher.go:109] Unable to decode an event from the watch stream: stream error: stream ID 1; INTERNAL_ERROR

If the client that's performing the watch isn't handling errors gracefully, applications can get into an inconsistent state. Affected applications include, but are not limited to, nginx-ingress and tiller (helm).

A specific manifestation of this bug is the following error when attempting a helm deployment.

Error: watch closed before Until timeout

Root cause

  • The configuration of azureproxy results in unexpected timeouts for outbound watches targeting kubernetes.svc.cluster.local.
  • On the client side, the client-go implementation of watch.Until does not handle intermittent network failures gracefully. 

Upcoming fix

We're rolling out a short-term fix that will create a mutating webhook admission controller that injects the provided FQDN for you to fix the issue. Rollout of the fix will be complete in all regions by November 9, 2018. The fix will be applicable to both existing and new clusters.


by via Azure service updates

QuickBooks Self-Employed

Bigger tax refunds. Better organization. Manage your deductions with QuickBooks Self-Employed .