Remediation: Grafana Dashboard Empty, Prometheus Scrape Blocked by NetworkPolicy

Immediate Fix (Networking — Domain C)

The default-deny policy in the payments namespace blocks Prometheus's scrape traffic; the fix is a NetworkPolicy rule that explicitly allows ingress to the metrics port from Prometheus.

Step 1: Add a NetworkPolicy allowing metrics scraping

$ kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-allow-prometheus-scrape
  namespace: payments
  annotations:
    author: platform-team
    ticket: SEC-5012-fix
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Single peer: namespaceSelector AND podSelector must both match, so
        # only Prometheus pods in the monitoring namespace are admitted.
        # Splitting them into two list entries would OR them, admitting every
        # pod in the monitoring namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus
      ports:
        - protocol: TCP
          port: 9090
EOF
networkpolicy.networking.k8s.io/payments-allow-prometheus-scrape created

Step 2: Also allow egress for DNS (required for service discovery)

$ kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
networkpolicy.networking.k8s.io/payments-allow-dns created

Step 3: Wait for Prometheus to rescrape

The Prometheus scrape interval is typically 15–30 seconds, and NetworkPolicy changes take effect as soon as the CNI plugin programs them, so targets should come back up within one scrape cycle. No Prometheus restart or reload is needed.

Verification

Domain A (Observability) — Dashboards populated

$ curl -s http://prometheus.monitoring.svc:9090/api/v1/targets | \
    jq '[.data.activeTargets[] | select(.labels.namespace=="payments")] | length'
6

$ curl -s http://prometheus.monitoring.svc:9090/api/v1/targets | \
    jq '.data.activeTargets[] | select(.labels.namespace=="payments") | .health' | sort | uniq -c
      6 "up"

All 6 targets in the payments namespace are up. Grafana dashboards show data again.
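The jq filters above can be sanity-checked offline against a hand-written sample of the /api/v1/targets payload. The sample below is an assumption, trimmed to only the fields the filters touch:

```shell
# Minimal, hand-written stand-in for Prometheus's /api/v1/targets response
# (assumed shape; real responses carry many more fields per target).
cat > /tmp/targets-sample.json <<'EOF'
{"data":{"activeTargets":[
  {"labels":{"namespace":"payments"},"health":"up"},
  {"labels":{"namespace":"payments"},"health":"up"},
  {"labels":{"namespace":"orders"},"health":"down"}
]}}
EOF

# Count targets in the payments namespace (same filter as the live check)
jq '[.data.activeTargets[] | select(.labels.namespace=="payments")] | length' /tmp/targets-sample.json

# Tally health states for payments targets (same filter as the live check)
jq '.data.activeTargets[] | select(.labels.namespace=="payments") | .health' /tmp/targets-sample.json | sort | uniq -c
```

Running the same filters against a local sample confirms the expected counts before trusting them against the live API.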

Domain B (Kubernetes) — NetworkPolicies correct

$ kubectl get networkpolicy -n payments
NAME                                POD-SELECTOR       AGE
payments-allow-dns                  <none>             2m
payments-allow-ingress              app=payment-svc    50m
payments-allow-prometheus-scrape    <none>             3m
payments-default-deny               <none>             50m

Domain C (Networking) — Metrics port reachable cross-namespace

$ kubectl run debug --rm -it --image=curlimages/curl -n monitoring -- \
    curl -s --connect-timeout 5 http://10.244.3.42:9090/metrics | head -3
# TYPE payment_requests_total counter
payment_requests_total{method="POST",status="200"} 49102

Prevention

  • Monitoring: Add an alert on Prometheus target health by namespace; fire a warning when all targets in a namespace go down simultaneously, which signals a NetworkPolicy or network-level problem rather than individual pod failures.
- alert: AllTargetsDownInNamespace
  expr: |
    count by (namespace) (up{namespace=~"payments|orders|users"} == 0)
    ==
    count by (namespace) (up{namespace=~"payments|orders|users"})
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "All Prometheus targets in {{ $labels.namespace }} are down"
  • Runbook: Every namespace hardening initiative must include a checklist of traffic to allow: the application port, the metrics port (9090 here), the health-check port, and DNS egress (53/UDP and 53/TCP). Use a standard NetworkPolicy template that covers all of them.

  • Architecture: Create a shared NetworkPolicy template or Helm helper that always includes the metrics port allowance. Use a policy-as-code tool like OPA/Gatekeeper to enforce that any default-deny policy must be accompanied by a metrics scrape allow rule.
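A starting point for such a template is sketched below. The names, application port, and namespace labels are placeholders to be adapted per cluster; the metrics and DNS rules mirror the remediation policies above:

```yaml
# Hedged sketch of a reusable per-namespace baseline policy (placeholder
# values); pair it with the namespace's default-deny policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: baseline-allow           # placeholder name
  namespace: target-namespace    # templated per namespace
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
    # Application and health-check ports. No 'from' clause means any source
    # may connect on these ports; tighten per service as needed.
    - ports:
        - { protocol: TCP, port: 8080 }
    # Metrics scrape, as in payments-allow-prometheus-scrape: one peer
    # combining namespaceSelector AND podSelector.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus
      ports:
        - { protocol: TCP, port: 9090 }
  egress:
    # DNS egress, as in payments-allow-dns.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
```

Keeping the metrics and DNS rules in the template (rather than per-team copies) is what prevents a repeat of this incident when the next namespace is hardened.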