Machine Health Checks

EKS Anywhere cluster yaml specification for machine health check configuration

Machine Health Checks Support

Provider support details

vSphere Bare Metal Nutanix CloudStack Snow
Supported?

You can configure EKS Anywhere to specify timeouts for machine health checks. A Machine Health Check is a resource that lets users define the conditions under which Machines within a Cluster should be considered unhealthy. A Machine Health Check is defined on a management cluster and scoped to a particular workload cluster. If the timeouts are not configured in the cluster spec, default values are used to configure the machine health checks.

Note: Although configuring machine health check timeouts in the EKS Anywhere cluster spec is optional, machine health checks are still installed for all clusters, using the default timeout values listed below.

The following cluster spec shows an example of how to configure health check timeouts:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  machineHealthCheck:
    nodeStartupTimeout: "10m0s"
    unhealthyMachineTimeout: "5m0s"

Machine Health Check Spec Details

machineHealthCheck (optional)

  • Description: top-level key; must be present in order to configure machine health check timeouts.
  • Type: object

nodeStartupTimeout (optional)

  • Description: determines how long a Machine Health Check waits for a Node to join the cluster before considering the Machine unhealthy.
  • Default: 20m0s for the Tinkerbell (Bare Metal) provider, 10m0s for all other providers.
  • Minimum value (if configured): 30s
  • Type: string
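
As an illustration of the field above, the following sketch overrides only nodeStartupTimeout, for example on hardware that is slow to provision. The "30m0s" value is an arbitrary choice for the example, and the sketch assumes that each timeout falls back to its own default when it is omitted:

```yaml
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  # Other cluster settings omitted.
  machineHealthCheck:
    # Illustrative value; when set, it must be at least 30s.
    nodeStartupTimeout: "30m0s"
    # unhealthyMachineTimeout is not set here, so (by assumption)
    # its default of 5m0s would apply.
```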

unhealthyMachineTimeout (optional)

  • Description: if a Machine matches an unhealthy condition for the full duration of this timeout, the Machine is considered unhealthy.
  • Default: 5m0s
  • Type: string
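
To show where these two timeouts might end up, here is a rough sketch of a Cluster API MachineHealthCheck resource on the management cluster. The mapping is an assumption rather than a description of what EKS Anywhere generates: the object name, the namespace, the selector label value, and the choice of Ready conditions are all hypothetical. Only the field names come from the Cluster API MachineHealthCheck type.

```yaml
# Illustrative only: not taken from a real EKS Anywhere cluster.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: my-cluster-name-worker-unhealthy   # hypothetical name
  namespace: eksa-system                   # assumed namespace
spec:
  clusterName: my-cluster-name             # workload cluster the check is scoped to
  nodeStartupTimeout: 10m0s                # would correspond to nodeStartupTimeout
  selector:
    matchLabels:
      # Hypothetical label value selecting one group of worker Machines.
      cluster.x-k8s.io/deployment-name: my-cluster-name-md-0
  unhealthyConditions:
    # Each timeout would correspond to unhealthyMachineTimeout.
    - type: Ready
      status: Unknown
      timeout: 5m0s
    - type: Ready
      status: "False"
      timeout: 5m0s
```

If the mapping works this way, a Machine whose Node reports Ready as Unknown or False for longer than the timeout would be flagged as unhealthy.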