Quiz: GCP Troubleshooting
10 questions
L1 (6 questions)
1. A GCE instance can't reach the internet. What do you check?
Show answer
1. Instance has an external IP or Cloud NAT is configured for the subnet.
2. VPC firewall rules allow egress (default allows all egress, but custom rules may deny).
3. Routes exist for 0.0.0.0/0 to the internet gateway.
4. No organization policy blocks external access for the instance (for example, constraints/compute.vmExternalIpAccess restricting external IPs).
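The checklist above can be sketched as a small function that reports which items look wrong. This is a minimal illustration only: `EgressConfig` and its field names are hypothetical and would have to be filled in by hand or from your own inventory tooling, not from any GCP API shown here.

```python
from dataclasses import dataclass


@dataclass
class EgressConfig:
    """Hypothetical snapshot of the settings that affect outbound access."""
    has_external_ip: bool
    subnet_has_cloud_nat: bool
    egress_denied_by_firewall: bool
    has_default_route: bool        # a 0.0.0.0/0 route to the internet gateway
    org_policy_blocks_access: bool


def egress_findings(cfg: EgressConfig) -> list[str]:
    """Return the checklist items that look wrong, in checklist order."""
    findings = []
    if not (cfg.has_external_ip or cfg.subnet_has_cloud_nat):
        findings.append("no external IP and no Cloud NAT on the subnet")
    if cfg.egress_denied_by_firewall:
        findings.append("a firewall rule denies egress")
    if not cfg.has_default_route:
        findings.append("no 0.0.0.0/0 route to the internet gateway")
    if cfg.org_policy_blocks_access:
        findings.append("an organization policy blocks external access")
    return findings
```

An empty list means none of the four checks flagged a problem; it does not prove connectivity.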
2. What is the difference between GCP firewall rules and Cloud Armor?
Show answer
Firewall rules: VPC-level, filter by IP/port/protocol/tags/service accounts, apply to instances. Cloud Armor: WAF/DDoS protection at the load balancer layer, supports geo-blocking, rate limiting, and OWASP rules. Use firewall rules for network segmentation; Cloud Armor for application-layer protection.
3. How do you debug IAM permission issues in GCP?
Show answer
1. Use Policy Troubleshooter in Cloud Console (explains allow/deny).
2. Check IAM bindings at project, folder, and org levels.
3. Look for IAM deny policies (a newer feature): an explicit deny is evaluated before allow policies and overrides them.
4. Verify the correct service account is being used.
5. Check org policies that may restrict permissions.
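To make the interaction between inherited bindings and deny policies concrete, here is a deliberately simplified model. It assumes permissions are granted directly to principals (real IAM grants roles, which contain permissions) and ignores conditions and org policies. The function name and data shapes are hypothetical.

```python
def is_allowed(principal, permission, deny_rules, allow_bindings_by_level):
    """Simplified check: an explicit deny wins, otherwise any inherited allow grants access.

    deny_rules: iterable of (principal, permission) pairs.
    allow_bindings_by_level: dict mapping a level name ("org", "folder",
        "project") to a set of (principal, permission) pairs.
    """
    key = (principal, permission)
    # Deny policies are checked first and override any allow.
    if key in set(deny_rules):
        return False
    # Allow bindings are inherited, so a grant at any level is enough.
    return any(key in bindings for bindings in allow_bindings_by_level.values())
```

In this model, a binding at the org level is enough to grant access on a project, which is why checking only the project's bindings can be misleading in either direction.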
4. A GCE instance starts but SSH fails with 'Connection refused'. What do you check?
Show answer
1. Firewall rule allows tcp:22 from your IP (default-allow-ssh may have been deleted).
2. Instance has an external IP or you're using IAP tunnel.
3. sshd is running (check serial console output).
4. If OS Login is enabled, your user has a login role (roles/compute.osLogin or roles/compute.osAdminLogin).
5. Boot disk is not full (a full disk can prevent sshd from accepting sessions).
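It can help to confirm which failure you are actually seeing before walking the list. 'Connection refused' means something answered and rejected the connection, which usually points at sshd not listening. Packets dropped by a firewall more often show up as a timeout. The sketch below classifies a TCP attempt using only the Python standard library. `probe_tcp` is a hypothetical helper name, and the three-second timeout is an arbitrary choice.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but nothing is listening.
        return "refused"
    except socket.timeout:
        # No reply at all: often a firewall drop or a missing route.
        return "timeout"
    except OSError:
        # Anything else (DNS failure, unreachable network, and so on).
        return "error"
```

For example, `probe_tcp("203.0.113.10", 22)` (a documentation-range address used here as a placeholder) would be run from the machine you are trying to SSH from.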
5. What is Cloud NAT and when do you need it?
Show answer
Cloud NAT provides outbound internet access for instances without external IPs. It's a managed NAT gateway at the VPC level. You need it when instances in private subnets must reach external APIs, package repos, or services but should not be directly reachable from the internet.
6. How do you investigate failed requests in GCP?
Show answer
1. Cloud Logging: filter by resource type, severity, and time range.
2. Error Reporting: auto-groups exceptions from supported languages.
3. Cloud Trace: distributed tracing for latency analysis.
4. Cloud Monitoring: metrics dashboards and alerting.
5. Load balancer logs for HTTP errors.
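As an illustration of the first step, the following sketch assembles a Cloud Logging filter string from a resource type, a minimum severity, and a time range. `build_log_filter` is a hypothetical helper; it only builds text that you would paste into Logs Explorer or pass to whatever client you use, and it does not call any API.

```python
def build_log_filter(resource_type: str, min_severity: str, start: str, end: str) -> str:
    """Build a Logging query that matches entries in [start, end] at or above min_severity.

    start and end are RFC 3339 timestamps, e.g. "2024-01-01T00:00:00Z".
    """
    clauses = [
        f'resource.type="{resource_type}"',
        f"severity>={min_severity}",
        f'timestamp>="{start}"',
        f'timestamp<="{end}"',
    ]
    return " AND ".join(clauses)


print(build_log_filter("gce_instance", "ERROR",
                       "2024-01-01T00:00:00Z", "2024-01-01T01:00:00Z"))
```

Narrowing the time range first tends to keep queries fast; further clauses such as a specific instance ID can be appended in the same way.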
L2 (4 questions)
1. A GKE pod gets Permission Denied when calling a GCP API. What do you check?
Show answer
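1. Workload Identity is configured: the pod's Kubernetes service account (KSA) is bound to a Google service account (GSA) via roles/iam.workloadIdentityUser.
2. The GSA has the required IAM roles.
3. The KSA annotation (iam.gke.io/gcp-service-account) matches the GSA email exactly.
4. A workload pool (PROJECT_ID.svc.id.goog) is enabled on the cluster.
5. The node pool has the correct access scopes (if not using Workload Identity, check the node service account).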
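Two of these checks are plain string comparisons and are easy to get subtly wrong. The sketch below builds the IAM member string that the KSA-to-GSA binding is expected to use and checks the KSA annotation against the GSA email. The helper names are hypothetical, and the project, namespace, and account names in the example are placeholders.

```python
KSA_ANNOTATION = "iam.gke.io/gcp-service-account"


def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Member string expected in the GSA's roles/iam.workloadIdentityUser binding."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def annotation_matches(ksa_annotations: dict, gsa_email: str) -> bool:
    """True only if the annotation is present and equals the GSA email exactly."""
    return ksa_annotations.get(KSA_ANNOTATION) == gsa_email


print(workload_identity_member("my-project", "default", "app"))
```

The comparison is intentionally strict: a trailing space or a different project in the email counts as a mismatch.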
2. Cloud SQL connectivity fails from a GKE cluster. What do you investigate?
Show answer
1. Cloud SQL Auth Proxy sidecar is running and healthy.
2. The proxy uses the correct instance connection name.
3. The GKE workload's SA has roles/cloudsql.client.
4. Private IP: Cloud SQL and GKE are in the same VPC or peered VPC.
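5. Public IP without the Auth Proxy: authorized networks include the cluster's egress addresses (node external IPs or Cloud NAT IPs), since pod IPs are normally not what Cloud SQL sees.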
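A malformed instance connection name is a common cause of proxy failures. The sketch below validates the usual `project:region:instance` shape. It is a hypothetical helper that assumes a plain project ID; domain-scoped project IDs, which themselves contain a colon, are not handled.

```python
def parse_instance_connection_name(name: str) -> dict:
    """Split 'project:region:instance' into its parts, or raise ValueError."""
    parts = name.split(":")
    # Assumption: exactly three non-empty fields (no domain-scoped project IDs).
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected project:region:instance, got {name!r}")
    project, region, instance = parts
    return {"project": project, "region": region, "instance": instance}
```

Running this against the value passed to the proxy catches missing fields and stray separators before you start looking at networking.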
3. GCE instance metadata shows the correct startup script but it didn't run. What do you check?
Show answer
1. Check serial port output (console log) for startup script errors.
2. Verify the script is in the correct metadata key (startup-script vs startup-script-url).
3. Check if the script has execution errors (missing dependencies, wrong shebang).
4. Startup scripts run as root, so check for permission issues in the script logic.
4. A Pub/Sub subscription has increasing unacked message backlog. What do you investigate?
Show answer
1. Subscriber is running and processing messages.
2. Subscriber is acking messages (check for processing errors that skip ack).
3. Ack deadline is sufficient; messages are redelivered if not acked in time.
4. Subscriber throughput keeps up with the publish rate.
5. A dead-letter topic is configured for poison messages.
6. Subscriber autoscaling is working.
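The throughput check comes down to simple arithmetic: the backlog grows at the publish rate minus the ack rate. The sketch below computes that net rate and, if the backlog is shrinking, a rough time to drain. `backlog_trend` is a hypothetical helper, and the rates are assumed to be steady averages in messages per second taken from your monitoring.

```python
def backlog_trend(publish_rate: float, ack_rate: float, backlog: float):
    """Return (net growth per second, seconds to drain or None if not draining)."""
    net = publish_rate - ack_rate
    # The backlog only drains when acks outpace publishes.
    seconds_to_drain = backlog / -net if net < 0 else None
    return net, seconds_to_drain
```

A positive net rate with a healthy subscriber suggests a capacity problem; a positive net rate with a low ack rate relative to deliveries suggests redelivery or processing errors instead.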