A Kubernetes operator for running synthetic checks as pods. Works great with Prometheus!
The Pod Status Check checks for pods that are older than ten minutes and in an unhealthy lifecycle phase. If a podStatusCheck detects that a pod is down, an alert is shown on the status page. When a pod is found to be in error, the exact pod’s name is shown as one of the strings in the Error field.
apiVersion: comcast.github.io/v1
kind: KuberhealthyCheck
metadata:
  name: pod-status
  namespace: kuberhealthy
spec:
  runInterval: 5m
  timeout: 15m
  podSpec:
    containers:
      - env:
          - name: SKIP_DURATION # the duration of time that pods are ignored for after being created
            value: "10m"
          - name: TARGET_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        image: kuberhealthy/pod-status-check:v1.3.0
        imagePullPolicy: IfNotPresent
        name: main
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
Phases that this check considers healthy
Phases that this check considers unhealthy
Note: This check assumes that a pod is unhealthy if it is over 10 minutes old and still Pending.
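The rule described above (skip young pods, then judge by phase) can be sketched as a small shell function. This is an illustration only, not the check's actual implementation. Which phases fall in each group is an assumption in this sketch: Running and Succeeded are treated as healthy, and every other phase (Pending, Failed, Unknown) as unhealthy. The function name pod_is_unhealthy and its arguments are hypothetical.

```shell
# Sketch only: the healthy/unhealthy phase sets are assumptions,
# not taken from the check's source code.
#
# Usage: pod_is_unhealthy PHASE AGE_SECONDS [SKIP_SECONDS]
# Returns 0 (true) if the pod would be reported, 1 (false) otherwise.
pod_is_unhealthy() {
  phase=$1
  age_seconds=$2
  skip_seconds=${3:-600}   # default mirrors SKIP_DURATION=10m

  # Pods younger than the skip duration are ignored entirely.
  if [ "$age_seconds" -lt "$skip_seconds" ]; then
    return 1
  fi

  case "$phase" in
    Running|Succeeded) return 1 ;;  # assumed healthy
    *) return 0 ;;                  # Pending, Failed, Unknown: assumed unhealthy
  esac
}
```

For example, `pod_is_unhealthy Pending 700` succeeds (exit status 0), while `pod_is_unhealthy Pending 300` does not, because the pod is still inside the skip window.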
By default, Pod Status Check will check pods in the same namespace it is installed into. This means the RBAC requirements for the service account the check runs with can be limited to a single namespace scope.
It is possible to configure Pod Status Check to check pods from all namespaces in a cluster. This requires cluster-wide permissions for the service account and is not recommended for multi-tenant setups.
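Before choosing a scope, it can help to confirm what the check's service account is actually allowed to do. The sketch below wraps `kubectl auth can-i` with impersonation. It assumes the check needs to list pods, the service account name pod-status-sa is hypothetical, and your own user must be permitted to impersonate service accounts for the command to work.

```shell
# Sketch only: SERVICE_ACCOUNT is a hypothetical name; substitute the
# account your check pod actually runs as.
NAMESPACE=kuberhealthy
SERVICE_ACCOUNT=pod-status-sa   # hypothetical

# Asks the API server whether the service account may list pods.
# Usage: can_list_pods            (namespace scope)
#        can_list_pods cluster    (all namespaces)
# kubectl prints "yes" or "no" and exits non-zero when the answer is "no".
can_list_pods() {
  subject="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
  if [ "${1:-}" = "cluster" ]; then
    kubectl auth can-i list pods --all-namespaces --as="$subject"
  else
    kubectl auth can-i list pods --namespace "$NAMESPACE" --as="$subject"
  fi
}
```

With a namespace-scoped install, `can_list_pods` should print `yes`, while `can_list_pods cluster` is expected to print `yes` only once the cluster-wide permissions have been applied.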
To implement the Pod Status Check with Kuberhealthy, apply the configuration file pod-status-check.yaml to your Kubernetes cluster:
kubectl apply -f https://raw.githubusercontent.com/kuberhealthy/kuberhealthy/2.0.0/cmd/pod-status-check/pod-status-check.yaml
To enable the cluster-wide option described above, apply pod-status-check-clusterscope.yaml instead, which applies the check with cluster permissions.
Alternatively, install Kuberhealthy with Helm and enable the check:
helm repo add kuberhealthy https://comcast.github.io/kuberhealthy/helm-repos
helm install kuberhealthy kuberhealthy/kuberhealthy --set check.podStatus.enabled=true
To enable the cluster-wide check with cluster permissions:
helm install kuberhealthy kuberhealthy/kuberhealthy --set check.podStatus.enabled=true --set check.podStatus.allNamespaces=true