Problem

  • API requests returning 503 Service Unavailable
  • Istio logs showing: no_healthy_upstream
  • Pods are in Running state but service routing is broken

Root Cause

The Service selector labels do not match the Pod labels, so the Service has no endpoints to route traffic to.

# Check endpoints - <none> means something is wrong!
kubectl -n staging-webs get endpoints api-gateway
NAME          ENDPOINTS   AGE
api-gateway   <none>      34h   # ← no Pod IPs!

How It Works

┌─────────────────────────────────────────────────────────┐
│  Service (api-gateway)                                  │
│  selector:                                              │
│    app: staging-api-gateway                             │
│    app.kubernetes.io/part-of: staging-webs  ← required! │
└─────────────────────────────────────────────────────────┘
                         │
                         ▼ find Pods with matching selector labels
┌─────────────────────────────────────────────────────────┐
│  Endpoints (auto-generated)                             │
│  addresses:                                             │
│    - 10.0.20.100 (Pod1 IP)                              │
│    - 10.0.20.101 (Pod2 IP)                              │
└─────────────────────────────────────────────────────────┘
                         │
                         ▼ load balance traffic
┌─────────────────────────────────────────────────────────┐
│  Pod (must have labels that match the selector)         │
│  labels:                                                │
│    app: staging-api-gateway           ✓                 │
│    app.kubernetes.io/part-of: staging-webs  ✓           │
└─────────────────────────────────────────────────────────┘

Key point: Every label in the Service selector must be present on the Pod, with the same value, for the Pod to be registered in Endpoints. Extra labels on the Pod do not affect the match.
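The matching rule above can be sketched as a small shell function. This is only an illustration of the subset check, not how Kubernetes implements it; the function name `missing_labels` and the comma-separated `key=value` input format are assumptions made for this example.

```shell
# missing_labels SELECTOR LABELS   (hypothetical helper, illustration only)
# Both arguments are comma-separated key=value lists.
# Prints every selector pair that is absent from LABELS.
# Returns 0 when nothing is missing, 1 otherwise.
missing_labels() (
  selector=$1
  labels=$2
  status=0
  set -f      # no filename globbing while splitting (local to this subshell)
  IFS=,       # split the selector on commas (local to this subshell)
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;                 # exact key=value is present on the Pod
      *) echo "$pair"; status=1 ;;    # key missing or value differs
    esac
  done
  return $status
)
```

For example, `missing_labels "app=staging-api-gateway,app.kubernetes.io/part-of=staging-webs" "app=staging-api-gateway"` prints `app.kubernetes.io/part-of=staging-webs` and returns a non-zero status, which corresponds to the empty Endpoints case.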

Diagnosis

1. Check Endpoints

kubectl -n <namespace> get endpoints <service-name>

# Normal: IPs present
NAME          ENDPOINTS         AGE
api-gateway   10.0.20.100:8085  34h

# Problem: <none>
NAME          ENDPOINTS   AGE
api-gateway   <none>      34h

2. Check Service Selector

kubectl -n <namespace> get svc <service-name> -o yaml | grep -A 10 "selector:"

3. Check Pod Labels

kubectl -n <namespace> get pod <pod-name> --show-labels

4. Compare Labels

# Service selector
selector:
  app: staging-api-gateway
  app.kubernetes.io/component: backend
  app.kubernetes.io/instance: staging-api-gateway
  app.kubernetes.io/name: java-service
  app.kubernetes.io/part-of: staging-webs   # ← check if this exists!

# Pod labels
app=staging-api-gateway                      ✓
app.kubernetes.io/component=backend          ✓
app.kubernetes.io/instance=staging-api-gateway ✓
app.kubernetes.io/name=java-service          ✓
app.kubernetes.io/part-of=???                ✗ ← missing = match failure!
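Instead of comparing labels by eye, one option is to ask the cluster which Pods the Service selector actually matches. The sketch below assumes `kubectl` access to the namespace; the function names `selector_of` and `pods_selected_by` are hypothetical helpers, not part of kubectl.

```shell
# selector_of NAMESPACE SERVICE   (hypothetical helper)
# Prints the Service selector as a comma-separated key=value list,
# which is the form that `kubectl get pods -l` accepts.
selector_of() {
  sel=$(kubectl -n "$1" get svc "$2" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}')
  printf '%s\n' "${sel%,}"   # drop the trailing comma left by the template
}

# pods_selected_by NAMESPACE SERVICE   (hypothetical helper)
# Lists the Pods that the Service selector matches right now.
# Note: an empty selector would list every Pod in the namespace.
pods_selected_by() {
  kubectl -n "$1" get pods -l "$(selector_of "$1" "$2")" --show-labels
}
```

If `pods_selected_by staging-webs api-gateway` returns nothing while `kubectl get pods -l app=staging-api-gateway` does list Pods, the extra selector labels are the ones to inspect.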

Fix

Option 1: Redeploy app (ArgoCD sync)

# ArgoCD CLI
argocd app sync <app-name>

# Trigger sync via kubectl
kubectl -n argocd patch application <app-name> \
  --type merge \
  -p '{"operation":{"initiatedBy":{"username":"admin"},"sync":{}}}'

Option 2: Manually add label (temporary fix)

# The label is lost when the Pod is recreated, so also fix the Deployment or chart
kubectl -n <namespace> label pod <pod-name> app.kubernetes.io/part-of=staging-webs
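When several replicas are affected, labeling Pods one by one is slow. A possible shortcut, sketched here with a hypothetical wrapper named `label_pods`, is to label every Pod matched by an existing label in a single call. The use of `--overwrite` is an assumption that replacing a wrong existing value is intended.

```shell
# label_pods NAMESPACE POD_SELECTOR LABEL   (hypothetical helper)
# Adds LABEL (key=value) to every Pod matched by POD_SELECTOR.
# Still a temporary fix: Pods created later will not carry the label.
label_pods() {
  # --overwrite replaces the label if it already exists with another value
  kubectl -n "$1" label pods -l "$2" "$3" --overwrite
}
```

For example: `label_pods staging-webs app=staging-api-gateway app.kubernetes.io/part-of=staging-webs`.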

Option 3: Patch the Deployment labels

# Changes the pod template, so this triggers a rolling update; the new Pods carry the label
kubectl -n <namespace> patch deployment <deployment-name> \
  --type merge \
  -p '{"spec":{"template":{"metadata":{"labels":{"app.kubernetes.io/part-of":"staging-webs"}}}}}'
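Whichever option is used, it helps to confirm that the Service picked up the Pods afterwards. The following is a minimal sketch, assuming a hypothetical helper named `verify_fix` and an arbitrary 120-second rollout timeout; adjust both to your environment.

```shell
# verify_fix NAMESPACE DEPLOYMENT SERVICE   (hypothetical helper)
# Waits for the rollout to finish, then checks that the Service
# has at least one ready endpoint address.
verify_fix() {
  kubectl -n "$1" rollout status "deployment/$2" --timeout=120s || return 1
  ips=$(kubectl -n "$1" get endpoints "$3" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -z "$ips" ]; then
    echo "endpoints for $3 are still empty" >&2
    return 1
  fi
  echo "endpoints for $3: $ips"
}
```

A non-zero exit status means either the rollout did not complete in time or the selector still matches no ready Pods.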

Prevention

  1. Keep selectorLabels consistent in Helm charts

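    • Every label in the Service selector must appear, with the same value, in the Deployment pod template labels
  2. Treat selectorLabels as immutable

    • A Deployment's spec.selector cannot be changed after creation; a Service selector can, but changing it alone breaks the match
    • Best to set them once and not change them
    • If they must change, update the Service and recreate the Deployment together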
  3. Be careful with ArgoCD

    • After modifying a chart, sync all related apps
    • Don’t leave apps in OutOfSync state
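To spot apps left in OutOfSync state, one possibility is to query the ArgoCD Application resources directly. This sketch assumes ArgoCD runs in the `argocd` namespace (as in the patch command earlier) and that the sync state is exposed at `.status.sync.status`; the function name `out_of_sync_apps` is hypothetical.

```shell
# out_of_sync_apps [ARGOCD_NAMESPACE]   (hypothetical helper)
# Prints the name of every ArgoCD Application whose sync status is OutOfSync.
# Assumes the Application CRD exposes the state at .status.sync.status.
out_of_sync_apps() {
  kubectl -n "${1:-argocd}" get applications.argoproj.io \
    -o jsonpath='{range .items[?(@.status.sync.status=="OutOfSync")]}{.metadata.name}{"\n"}{end}'
}
```

Running `out_of_sync_apps` after a chart change gives a quick list of apps that still need `argocd app sync`.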

Quick Diagnostic Commands

# Full diagnosis in one go
NS=staging-webs
SVC=api-gateway
APP=staging-api-gateway   # value of the Pods' app label; differs from the Service name

echo "=== Endpoints ==="
kubectl -n "$NS" get endpoints "$SVC"

echo "=== Service Selector ==="
kubectl -n "$NS" get svc "$SVC" -o json | jq .spec.selector

echo "=== Pod Labels ==="
kubectl -n "$NS" get pods -l "app=$APP" --show-labels

echo "=== Istio logs (recent errors) ==="
kubectl -n istio-system logs -l app=istio-ingressgateway --tail=10 | grep "no_healthy"
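The same endpoint check can be widened to a whole namespace. The sketch below, using a hypothetical helper named `services_without_endpoints`, relies on the column layout of `kubectl get endpoints` shown earlier (NAME, ENDPOINTS, AGE).

```shell
# services_without_endpoints NAMESPACE   (hypothetical helper)
# Prints the name of every Endpoints object whose ENDPOINTS column is <none>.
services_without_endpoints() {
  kubectl -n "$1" get endpoints --no-headers |
    awk '$2 == "<none>" { print $1 }'
}
```

For example, `services_without_endpoints staging-webs` lists every Service in that namespace that currently selects no ready Pods.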
