On a K8s cluster there are multiple sources of failure: i) involuntary disruptions, such as hardware malfunctions, a cluster VM being accidentally removed, a kernel panic, or pod eviction resulting from hard memory pressure, and ii) voluntary disruptions, caused by pods being deleted, updated or scaled. For involuntary failures, typical counteractions at node level are draining and cordoning (see here), i.e. all pods being evicted and rescheduled on other nodes, and the node being temporarily removed from the cluster's set of schedulable workers. To mitigate involuntary failures, running a service in high-availability (HA) mode is always a good idea, along with spatial distribution across multiple data centers and regions.
A Pod disruption budget (PDB) is a mechanism for specifying how many instances can be down simultaneously due to voluntary disruption, such as when nodes are drained during a maintenance rollout, where we want to guarantee that at least p% of the instances are up and running while the remaining (100-p)% can be taken down for maintenance. Specifically, the minAvailable and maxUnavailable values can be used for this purpose (only one of the two can be set in a given PDB); each refers to a number of pods when defined as an integer, or to a percentage of pods when defined as a string (e.g. "20%"). A selector is also required in the specification, as it is the way to assign the PDB to a specific set of pods (such as those created by a Deployment, ReplicaSet or StatefulSet).
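As a rough illustration of the fields mentioned above, a PDB manifest could look like the sketch below. The name web-pdb, the label app: web and the "80%" threshold are made up for the example, and it assumes a cluster recent enough to serve the policy/v1 API.

```yaml
# Minimal sketch of a PodDisruptionBudget; all names and values are illustrative.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb            # hypothetical name
spec:
  # Keep at least 80% of the selected pods available during voluntary disruptions.
  # A string is read as a percentage; an integer (e.g. 4) would be a pod count.
  # maxUnavailable could be used here instead, but not together with minAvailable.
  minAvailable: "80%"
  selector:
    matchLabels:
      app: web             # hypothetical label; must match the pods to protect
```

With 10 matching pods, this budget would allow at most 2 of them to be evicted at the same time. Assuming the manifest is saved as pdb.yaml, it can be applied with kubectl apply -f pdb.yaml and inspected with kubectl get pdb.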
Cheers