Author: Julien Peltier

Semarchy Diagnostic install chart


Need

The purpose of this chart is to verify that the installation of the Semarchy Data Platform (SDP) in a self-hosted environment has been completed successfully.

It collects logs, secrets, job statuses, and cluster resources using the kubectl support-bundle plugin, then analyzes them to confirm that all required components are healthy and correctly provisioned.

This diagnostic tool should always be executed after the Helm installation of SDP.


Detailed Solution

Usage

To run this chart, first switch your kubectl context to the target cluster:

kubectl config use-context <desired-context>


When to run

  • Prerequisites

First, install SDP in your cluster:

helm install <sdp-release-name> <chart> -n <sdp-namespace> -f values.yaml

Where <sdp-namespace> is the Kubernetes namespace where SDP is installed.
Note this namespace, as it is required for the diagnostic steps below.

Using the diagnostic tooling

Follow these steps:

  • Execute the following commands to launch the diagnostic job

Linux:

export NAMESPACE=<sdp-namespace>
export CJ=$(kubectl get cronjob -n "$NAMESPACE" -l k8s.semarchy.net/diagnostic-install-job=true -o jsonpath='{.items[0].metadata.name}')
kubectl create job -n "$NAMESPACE" --from="cronjob/$CJ" semarchy-diagnostic-install-job

Windows:

set NAMESPACE=<sdp-namespace>
for /f "tokens=2 delims=/" %i in ('kubectl get cronjob -n "%NAMESPACE%" -l "k8s.semarchy.net/diagnostic-install-job=true" -o name') do @set "CJ=%i"
kubectl create job -n %NAMESPACE% --from=cronjob/%CJ% semarchy-diagnostic-install-job
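Both variants extract the CronJob's bare name: `-o name` prints a `cronjob.batch/<name>` reference, and the Windows loop strips everything up to the slash, while the Linux jsonpath query reads the name directly. The parsing step can be checked in isolation (the CronJob name below is a hypothetical example):

```shell
# Simulate the `kubectl get cronjob -o name` output and strip the resource
# prefix, mirroring what the Windows for /f loop extracts.
name="cronjob.batch/semarchy-diagnostic-install"
CJ="${name#*/}"        # drop everything through the first '/'
echo "$CJ"             # prints: semarchy-diagnostic-install
```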
  • Wait for the job to complete

Linux:

export JOB=$(kubectl get job -n "$NAMESPACE" -l k8s.semarchy.net/diagnostic-install-job=true -o jsonpath='{.items[0].metadata.name}')
kubectl wait --for=condition=complete "job/$JOB" -n "$NAMESPACE" --timeout=10m


Windows:

for /f "tokens=2 delims=/" %i in ('kubectl get job -n "%NAMESPACE%" -l "k8s.semarchy.net/diagnostic-install-job=true" -o name') do @set "JOB=%i"
kubectl wait --for=condition=complete job/%JOB% -n %NAMESPACE% --timeout 10m
  • When the job has finished, download the diagnostic bundle to your machine

Linux:

export POD=$(kubectl get pod -n "$NAMESPACE" -l k8s.semarchy.net/diagnostic-sidecar=true -o jsonpath='{.items[0].metadata.name}')
export LATEST=$(kubectl exec -n "$NAMESPACE" "$POD" -- sh -c 'ls -1t /diagnostic-volume/support-bundle*.tar.gz 2>/dev/null | head -n1')
kubectl cp -n "$NAMESPACE" "$POD:$LATEST" ./support-bundle.tar.gz

Windows:

for /f "tokens=2 delims=/" %i in ('kubectl get pod -n "%NAMESPACE%" -l "k8s.semarchy.net/diagnostic-sidecar=true" -o name') do @set "POD=%i"
for /f "delims=" %i in ('kubectl exec -n "%NAMESPACE%" "%POD%" -- sh -c "ls -1t /diagnostic-volume/support-bundle*.tar.gz 2>/dev/null ^| head -n1"') do @set "LATEST=%i"
kubectl cp -n %NAMESPACE% "%POD%:%LATEST%" ./support-bundle.tar.gz
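Before sharing the archive with support, it can be worth a quick integrity check. The sketch below creates a tiny stand-in bundle so the commands run anywhere; on a real download, skip the first two lines and run the checks against the copied support-bundle.tar.gz:

```shell
# Stand-in bundle (illustrative only; a real bundle comes from `kubectl cp` above).
mkdir -p support-bundle && echo '{}' > support-bundle/analysis.json
tar -czf support-bundle.tar.gz support-bundle

# Verify gzip integrity, then preview the archive contents.
gzip -t support-bundle.tar.gz && echo "archive OK"
tar -tzf support-bundle.tar.gz
```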


Data collected

This chart is designed to collect the following diagnostic data:

  • Cluster Resources: all resources within the release namespace.

  • Jobs (logs): logs (up to 20,000 lines) from provisioning and setup jobs of the SDP chart, such as log-explorer-service-setup.

  • StatefulSets (logs): logs from Keycloak pods, across all containers, with timestamps and prefixes.

  • Deployments (logs): logs from deployments such as billing-service, dm-core-active, dm-core-passive, log-explorer, log-explorer-service, site-admin, user-profile, and welcome.

  • Helm Values: values of the semarchy-data-platform chart in the namespace.

  • Secrets: selected secrets managed by the SDP chart.

  • RunPod Checks: test connections to Postgres, and API queries against Keycloak realms and clients, using temporary pods.

  • Analyzers:

    · Kubernetes version validation (>= 1.26)

    · Secret presence checks (Keycloak, OpenSearch, Postgres, Kafka, etc.)

    · Job completion checks for setup/provisioning jobs

    · Pod health checks (CrashLoopBackOff, ImagePullBackOff, Pending, etc.)


Bundle repository tree

When extracted, the bundle's directory structure looks like the following:

.
├── analysis.json -> contains the results of all checks
├── cluster-info
│   └── cluster_version.json
├── cluster-resources
│   ├── auth-cani-list
│   ├── clusterrolebindings.json
│   ├── clusterroles.json
│   ├── configmaps
│   ├── cronjobs-errors.json
│   ├── custom-resource-definitions-errors.json
│   ├── custom-resource-definitions.json
│   ├── custom-resources
│   ├── daemonsets
│   ├── deployments
│   ├── endpoints
│   ├── endpointslices
│   ├── events
│   ├── groups-resources-errors.json
│   ├── groups.json
│   ├── ingress-errors.json
│   ├── jobs
│   ├── leases
│   ├── limitranges
│   ├── namespaces.json
│   ├── network-policy
│   ├── nodes.json
│   ├── pod-disruption-budgets-errors.json
│   ├── pods
│   ├── priorityclasses-errors.json
│   ├── priorityclasses.json
│   ├── pvcs
│   ├── pvs.json
│   ├── resource-quota
│   ├── resources.json
│   ├── rolebindings
│   ├── roles
│   ├── serviceaccounts
│   ├── services
│   ├── statefulsets
│   ├── statefulsets-errors.json
│   ├── storage-classes-errors.json
│   ├── storage-classes.json
│   └── volumeattachments.json
├── execution-data
│   └── summary.txt
├── helm
│   └── semarchy.json -> contains all information about the installed chart
├── keycloak-clients
│   ├── keycloak-clients-events.json
│   ├── keycloak-clients.json
│   └── keycloak-clients.log -> check logs for the list of clients
├── keycloak-realms
│   ├── keycloak-realms-events.json
│   ├── keycloak-realms.json
│   └── keycloak-realms.log -> check logs for the list of realms
├── logs
│   └── pods -> contains the logs of all pods in the namespace
├── secrets
│   └── semarchy -> contains the secrets managed by the SDP chart
├── test-postgres-connection
│   ├── test-postgres-connection-events.json
│   ├── test-postgres-connection.json
│   └── test-postgres-connection.log -> check logs to see the result
└── version.yaml
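A quick way to triage the bundle is to scan analysis.json for failing checks. The exact field names depend on the Troubleshoot analyzer output format, so the sketch below writes an illustrative sample file in an assumed shape (name/isPass/isFail/message); inspect your real analysis.json and adjust accordingly:

```shell
# Illustrative sample in the assumed analyzer-result shape; a real bundle's
# analysis.json may differ, so check the actual file first.
cat > analysis.json <<'EOF'
[
  {"name": "kubernetes-version", "isPass": true,  "isFail": false, "message": "1.29 >= 1.26"},
  {"name": "postgres-secret",    "isPass": false, "isFail": true,  "message": "secret is missing"}
]
EOF

# Print every failed check (requires python3).
python3 - <<'EOF'
import json
for result in json.load(open("analysis.json")):
    if result.get("isFail"):
        print(f"FAIL {result['name']}: {result['message']}")
EOF
```

With the sample above, this prints `FAIL postgres-secret: secret is missing`; an empty output means every check passed.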