Grading Checklist

A good response must include:

  • Identifies the root cause: the aggressive hold timer (15s) combined with intermittent packet loss on a degraded link causes keepalive misses that trigger session drops
  • Explains how BGP hold timers work: if no keepalive or update is received within the hold time, the session is declared down
  • Correlates the link errors (CRC, input errors from the media converter) with keepalive packet loss
  • Notes that missing 3 consecutive keepalives (sent at 5s intervals) is enough to expire the 15s hold timer, which even modest packet loss can easily trigger
  • Proposes the two-part fix: (1) fix the physical layer issue (media converter) and (2) increase hold timer to a reasonable value
  • Recommends standard timer values (90s hold / 30s keepalive, or 60s hold / 20s keepalive) unless BFD is available for fast failover
  • Suggests using BFD for sub-second failure detection instead of aggressive BGP timers
  • Recommends investigating the media converter (replace it, check SFP, check cable)
  • Mentions route flap dampening as a mitigation to limit the impact of flapping routes on downstream routers
  • Shows how to read the BGP event log to confirm hold timer expiry as the cause
  • Warns that aggressive timers without a clean link cause more outages than they prevent
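
For reference when grading the "3 consecutive missed keepalives" item, the arithmetic can be sketched as below. This is a simplified model, not a description of any router's behavior: it assumes each keepalive is lost independently with a fixed probability, that no UPDATE messages arrive to reset the hold timer, and it ignores TCP retransmission (BGP runs over TCP, so real loss on a degraded link is burstier and interacts with retransmit timers). The function name and parameters are hypothetical.

```python
def prob_hold_expiry(loss: float, keepalives: int, misses_to_expire: int = 3) -> float:
    """Probability of at least one run of `misses_to_expire` consecutive
    lost keepalives among `keepalives` sent, assuming independent loss.

    Simplified model (assumption): the hold timer expires exactly when
    `misses_to_expire` keepalives in a row are lost.
    """
    # state[i] = probability the session is still up with i consecutive losses so far
    state = [1.0] + [0.0] * (misses_to_expire - 1)
    expired = 0.0
    for _ in range(keepalives):
        nxt = [0.0] * misses_to_expire
        # A received keepalive resets the run of losses to zero
        nxt[0] = sum(state) * (1.0 - loss)
        # A lost keepalive extends the current run by one
        for i in range(misses_to_expire - 1):
            nxt[i + 1] = state[i] * loss
        # One more loss on top of (misses_to_expire - 1) expires the hold timer
        expired += state[misses_to_expire - 1] * loss
        state = nxt
    return expired


if __name__ == "__main__":
    # Keepalives sent per hour: 720 at a 5s interval, 120 at a 30s interval
    for loss in (0.05, 0.10, 0.20):
        aggressive = prob_hold_expiry(loss, keepalives=720)
        standard = prob_hold_expiry(loss, keepalives=120)
        print(f"loss={loss:.0%}  5s/15s: {aggressive:.3f}  30s/90s: {standard:.3f}")
```

Under these assumptions both timer pairs tolerate the same number of consecutive misses, so the aggressive timers lose mainly because they send six times as many keepalives per hour and therefore get six times as many chances to hit a run of three losses. In practice the longer hold time also leaves more room for TCP retransmission to recover a lost keepalive, which this sketch does not model.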