Need
When deploying a Helm chart, you might encounter an error message such as:
Error: failed post-install: 1 error occurred: * job <job_name> failed: BackoffLimitExceeded
or
Error: failed post-install: 1 error occurred: * timed out waiting for the condition
Unfortunately, this message alone does not provide enough detail to diagnose the root cause. To troubleshoot effectively, collect more detailed information from the Kubernetes cluster, specifically from the pods involved in the failed job.
Summarized Solution
To investigate the issue, you can:
List all pods in the target namespace to identify which ones might be failing.
kubectl get pods -n <namespace>
Describe and get logs from the problematic pods to see what went wrong.
kubectl describe pod <pod> -n <namespace>
kubectl logs <pod> -n <namespace>
Detailed Solution
1- Identify the namespace and list pods
Run the following command to list all pods in the namespace where your Helm release was deployed:
kubectl get pods -n <namespace>
This will display the status of each pod.
Look for pods with a Status such as Error, CrashLoopBackOff, or ImagePullBackOff.
Example output:
NAME     READY   STATUS    RESTARTS   AGE
<pod1>   0/1     Error     1          2m
<pod2>   1/1     Running   0          5m
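When the namespace contains many pods, it can help to narrow the list down. The sketch below shows two possible approaches: selecting the pods created by the failed job, and filtering the pod list on the STATUS column. The function names are hypothetical, and the first function assumes the pods carry the `job-name` label that the Kubernetes Job controller normally sets.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of kubectl or Helm.

# List the pods created by a given Job (for example a Helm hook job).
# Assumption: the pods carry the standard `job-name` label.
# $1 = job name, $2 = namespace
list_job_pods() {
    kubectl get pods -n "$2" -l "job-name=$1" --no-headers
}

# Keep only pods whose STATUS (3rd column) is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin and prints
# the pod name and its status, separated by a tab.
filter_unhealthy_pods() {
    awk '$3 != "Running" && $3 != "Completed" { print $1 "\t" $3 }'
}
```

For example, piping the output of kubectl get pods -n <namespace> --no-headers into filter_unhealthy_pods prints only the pods that need attention. A pod in the Running state with 0/1 in the READY column can still be unhealthy, so treat this filter as a first pass only.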
2- Describe the failing pods
To understand what caused the failure, describe the problematic pod:
kubectl describe pod <pod> -n <namespace>
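This command shows the pod's configuration, the state of each container, and the recent events recorded for the pod. The Events section at the end of the output usually indicates why the pod failed (for example failed probes, image pull errors, or scheduling problems).
Example 'Events' section: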
Events:
  Type     Reason     Age                 From     Message
  Warning  Unhealthy  39m (x9 over 42m)   kubelet  Readiness probe failed: Get "http://10.244.0.17:5000/auth/realms/master": dial tcp 10.244.0.17:5000: connect: connection refused
  Warning  Unhealthy  39m (x6 over 41m)   kubelet  Liveness probe failed: dial tcp 10.244.0.17:5000: connect: connection refused
  Normal   Killing    39m                 kubelet  Container keycloak failed liveness probe, will be restarted
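If the describe output is long, the warnings can also be queried directly from the event list. The following is a minimal sketch, assuming the events for the pod have not yet expired from the cluster; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that lists Warning events for one pod,
# sorted by the time they were last seen.
# $1 = pod name, $2 = namespace
pod_warning_events() {
    kubectl get events -n "$2" \
        --field-selector "involvedObject.name=$1,type=Warning" \
        --sort-by=.lastTimestamp
}
```

Calling pod_warning_events <pod> <namespace> would show only entries such as the failed readiness and liveness probes from the example above, without the Normal events.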
3- Retrieve the logs
Next, get the container logs to see what happened right before the failure:
kubectl logs <pod> -n <namespace> --all-containers=true
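When a container has been restarted (as in the liveness probe example above), the current logs may only cover the new instance. The sketch below uses the --previous flag of kubectl logs to read the logs of the previous container instance; the function name is hypothetical, and the command returns an error if the container has never been restarted.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that prints the logs of the previous
# (terminated) instance of a container.
# $1 = pod name, $2 = namespace, $3 = container name (optional)
previous_logs() {
    if [ -n "${3:-}" ]; then
        # A specific container was requested
        kubectl logs "$1" -n "$2" -c "$3" --previous
    else
        # Fall back to the pod's default container
        kubectl logs "$1" -n "$2" --previous
    fi
}
```

For the example above, previous_logs <pod> <namespace> keycloak would print what the keycloak container logged before it was killed.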
Optional: Use the automation script
To speed up the process, you can use the following shell script, which automatically collects the pod descriptions and logs:
#!/bin/sh
# Script to collect pod descriptions and logs for one or all namespaces
# Organized into a timestamped main folder with per-namespace subfolders
# Check for help options first
if [ "${1:-}" = "-h" ] || [ "${1:-}" = "--help" ]; then
    echo "Usage: $0 [OPTIONS] [NAMESPACE...]"
    echo
    echo "Collects pod descriptions and logs for some or all namespaces."
    echo "Data is saved in a timestamped main folder with per-namespace subfolders."
    echo
    echo "Options:"
    echo "  -h, --help     Show this help message and exit."
    echo "  [NAMESPACE]    Specify one or more namespaces, separated by space, to collect data from."
    echo "                 If no namespaces are specified, data for all namespaces is gathered."
    echo
    exit 0
fi
# Enable strict mode (exit on error, fail on unset variables) after the help check
set -eu
# Generate main timestamped folder
main_folder="pod_details_$(date +"%Y%m%d_%H%M%S")"
mkdir -p "$main_folder"
# Determine target namespaces
# If no argument provided: loop on all namespaces
if [ "$#" -eq 0 ]; then
    echo "No namespace provided. Gathering data for all namespaces..."
    namespaces=$(kubectl get ns --no-headers | awk '{print $1}')
else
    # If one or more namespaces are provided, use them
    # ("$*" joins the arguments into one space-separated string)
    namespaces="$*"
    echo "Namespaces provided. Gathering data only for namespaces: $namespaces"
fi
# Loop over each namespace
for ns in $namespaces; do
    echo "Checking namespace: $ns"
    # Get the names of the pods in the namespace
    pod_list=$(kubectl get pod -n "$ns" --no-headers 2>/dev/null | awk '{print $1}')
    # Skip if no pods found
    if [ -z "$pod_list" ]; then
        echo "  No pods found in namespace: $ns — skipping."
        continue
    fi
    echo "  Pods found in namespace: $ns — collecting logs."
    # Create subfolder for the namespace
    ns_folder="${main_folder}/${ns}"
    mkdir -p "$ns_folder"
    # Save the detailed pod list to a file
    kubectl get pod -o wide -n "$ns" > "$ns_folder/pods.txt"
    # Loop through pods, processing each one in a background subshell
    for pod in $pod_list; do
        (
            echo "  Processing pod: $pod"
            kubectl describe pod "$pod" -n "$ns" > "$ns_folder/${pod}_describe.txt"
            log_file="$ns_folder/${pod}_logs.log"
            # Remove ANSI colors, replace literal \n with new lines and \t with tabs,
            # and unescape slashes (the \x1b, \n and \t escapes require GNU sed)
            kubectl logs "$pod" -n "$ns" --all-containers=true 2>/dev/null \
                | sed -e 's/\x1b\[[0-9;]*m//g' -e 's/\\n/\n/g' -e 's/\\t/\t/g' -e 's/\\\//\//g' > "$log_file"
            # Remove the file if it's empty
            if [ ! -s "$log_file" ]; then
                echo "  No logs for pod: $pod"
                rm -f "$log_file"
            fi
        ) &
    done
    # Wait for all background subshells of this namespace to finish
    wait
    echo "  Finished namespace: $ns"
done
echo "All available pod details and logs saved in: $main_folder"
Copy the script into a file named poddetails.sh, then make it executable and run it against the namespace:
chmod +x poddetails.sh
./poddetails.sh <namespace>
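Once the script has finished, the collected log files can be searched for typical failure keywords. The following is a minimal sketch, assuming the folder layout produced by the script above (one subfolder per namespace, with files ending in _logs.log); the function name and the search pattern are hypothetical.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that searches the collected log files.
# $1 = main folder created by poddetails.sh (pod_details_<timestamp>)
# $2 = extended regular expression to look for (case-insensitive)
search_collected_logs() {
    # /dev/null forces grep to print the file name even for a single file
    find "$1" -type f -name '*_logs.log' \
        -exec grep -inE "$2" /dev/null {} +
}
```

For example, search_collected_logs pod_details_<timestamp> 'error|exception|fatal' prints each matching line prefixed with its file name and line number, which makes it easier to decide which pod to inspect first.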