
What Happens When You Click a Link


Topics: DNS, ARP, TCP/IP, TLS, HTTP, load balancing, routing
Level: L1–L2 (Foundations → Operations)
Time: 60–90 minutes
Prerequisites: None (everything is explained from scratch)


The Mission

You type https://app.example.com/dashboard into your browser and press Enter. A page loads. It took 400 milliseconds.

In those 400 milliseconds, your computer talked to at least 6 different systems, used 4 different protocols, performed a cryptographic handshake, and traversed hardware you've never seen in a datacenter you'll never visit.

This lesson follows that click through every layer of the stack, in order, exhaustively. Nothing is out of scope — we follow the request wherever it goes. Some layers get a sentence, others get a deep dive, proportional to where the interesting complexity lives.

By the end, you'll understand what happens at every hop between "press Enter" and "page loads" — and more importantly, you'll know which layer to investigate when something breaks.


Step 1: The Browser Checks Its Caches

Before anything hits the network, your browser tries to avoid the network entirely.

Cache check order:

1. Browser memory cache — did we fetch this URL in the last few minutes?
2. Browser disk cache — is there a cached response with valid Cache-Control headers?
3. Service Worker — is there a registered worker intercepting this request?

If the cache has a valid response, we're done — no network, no DNS, no TLS. This is why Cache-Control: max-age=31536000 on static assets makes websites fast: the browser never asks the server at all.

If not cached, the browser needs an IP address. The URL has a hostname (app.example.com), but the network speaks IP addresses. Time for DNS.

Gotcha: Cache-Control: no-cache does NOT mean "don't cache." It means "cache it, but revalidate with the server before using it." The header that actually prevents caching is Cache-Control: no-store. This confuses everyone.
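The no-cache/no-store/max-age distinction can be sketched as a toy freshness check. This is a hypothetical simplification handling only these three directives; real browsers implement the full RFC 9111 caching algorithm.

```python
# Toy model of the browser's "can I serve this from cache?" decision.
# Simplified: handles only no-store, no-cache, and max-age.
import re
import time

def is_fresh(cache_control: str, stored_at: float, now: float = None) -> bool:
    """Return True if a cached response is usable without revalidation."""
    now = now if now is not None else time.time()
    directives = {d.strip() for d in cache_control.lower().split(",")}
    if "no-store" in directives:
        return False   # should never have been cached at all
    if "no-cache" in directives:
        return False   # cached, but must revalidate with the server first
    m = re.search(r"max-age=(\d+)", cache_control.lower())
    if m:
        return (now - stored_at) < int(m.group(1))
    return False       # no freshness info: revalidate

# A static asset stored 10 seconds ago with a one-year max-age is served
# straight from cache; a no-cache response always goes back to the server:
print(is_fresh("public, max-age=31536000", stored_at=time.time() - 10))  # True
print(is_fresh("no-cache", stored_at=time.time()))                       # False
```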


Step 2: DNS Resolution — Finding the IP Address

Your browser asks: "What is the IP address of app.example.com?"

This question goes through a hierarchy of caches before anyone actually looks it up:

Browser DNS cache
  → OS DNS cache (systemd-resolved, nscd, etc.)
    → /etc/hosts file (yes, this still wins)
      → Configured DNS resolver (/etc/resolv.conf)
        → Recursive resolver (your ISP or 8.8.8.8)
          → Root servers → .com servers → example.com authoritative server

The recursive resolver does the actual work. If it doesn't have the answer cached, it walks the hierarchy:

1. Ask a root server: "Where is .com?"
   → Root says: "Try 192.5.6.30 (a.gtld-servers.net)"

2. Ask the .com server: "Where is example.com?"
   → .com says: "Try 198.51.100.1 (ns1.example.com)"

3. Ask example.com's nameserver: "What is app.example.com?"
   → ns1 says: "It's 203.0.113.50, TTL 300 seconds"

Maximum 4 hops. In practice, the root and .com answers are almost always cached, so most lookups need only 1-2 queries.
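The three-query walk can be simulated with a toy in-memory zone map. The data mirrors the example above and is purely illustrative; this is not a real DNS client.

```python
# Toy model of recursive resolution: each "server" knows either a referral
# (where to ask next) or the final answer. Hypothetical data, not real DNS.
ZONES = {
    "root":               {"com.":             ("referral", "a.gtld-servers.net")},
    "a.gtld-servers.net": {"example.com.":     ("referral", "ns1.example.com")},
    "ns1.example.com":    {"app.example.com.": ("answer",   "203.0.113.50")},
}

def resolve(name: str, server: str = "root", queries: int = 0):
    """Follow referrals down the hierarchy until a server answers."""
    for suffix, (kind, value) in ZONES[server].items():
        if (name + ".").endswith(suffix):
            if kind == "answer":
                return value, queries + 1
            return resolve(name, server=value, queries=queries + 1)
    raise LookupError(name)

ip, n = resolve("app.example.com")
print(ip, n)  # 203.0.113.50 3  -- root, .com, then authoritative
```

In the real system, caching at the recursive resolver means the first two referrals are usually skipped entirely.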

Name Origin: DNS was invented in 1983 by Paul Mockapetris (RFC 882/883). Before DNS, every IP-to-hostname mapping on the internet was in a single file called HOSTS.TXT, managed by Elizabeth Feinler at Stanford Research Institute. Every computer on the internet periodically downloaded this file. By 1983, the internet had grown to several hundred hosts and the system was collapsing under its own weight. DNS replaced a flat file with a distributed, hierarchical database that now handles trillions of queries daily.

Trivia: There are 13 root server identities (named A through M), but over 1,700 physical servers worldwide. They use anycast — the same IP address is announced from multiple locations via BGP, and the internet's routing automatically sends your query to the nearest instance. The limit of 13 exists because the original root server list had to fit in a single 512-byte UDP DNS response.

Gotcha: /etc/hosts takes priority over DNS on most systems. This is a pre-DNS relic that still works — and is still actively exploited by malware to hijack domain resolution. When DNS debugging, always check /etc/hosts first.

What the browser actually got

The DNS response contains an A record (IPv4 address) or AAAA record (IPv6). If both exist, modern browsers use the Happy Eyeballs algorithm — they race IPv4 and IPv6 connections simultaneously and use whichever responds first. This solved the IPv6 transition without requiring users to make a choice.
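The racing idea can be sketched with simulated delays. The delays here are made up and there are no real sockets; the actual algorithm (RFC 8305, "Happy Eyeballs v2") staggers the attempts, preferring IPv6 by a small head start, rather than launching both at once.

```python
# Toy sketch of Happy Eyeballs: start both connection attempts,
# take whichever finishes first. Delays simulate connect latency.
import threading
import queue
import time

def attempt(family: str, delay: float, results: queue.Queue):
    time.sleep(delay)          # pretend this is the connect() round trip
    results.put(family)

def happy_eyeballs(ipv6_delay: float, ipv4_delay: float) -> str:
    results = queue.Queue()
    for family, delay in (("IPv6", ipv6_delay), ("IPv4", ipv4_delay)):
        threading.Thread(target=attempt, args=(family, delay, results),
                         daemon=True).start()
    return results.get()       # first attempt to complete wins

# If the IPv6 path is broken or slow, IPv4 wins the race transparently:
print(happy_eyeballs(ipv6_delay=0.3, ipv4_delay=0.01))  # IPv4
```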

# See what DNS returns (you'll use dig more than any other DNS tool)
dig app.example.com

# Just the answer, nothing else
dig +short app.example.com
# → 203.0.113.50

# Trace the full resolution path
dig +trace app.example.com

# Check a specific DNS server
dig @8.8.8.8 app.example.com

Step 3: ARP — Getting the MAC Address

We have an IP address: 203.0.113.50. But IP addresses don't exist on a wire — Ethernet frames use MAC addresses (48-bit hardware addresses burned into your network card).

Your computer needs to figure out: "What MAC address should I put on this Ethernet frame?"

The answer depends on whether the destination is on the local network or not:

If the destination is local (same subnet), your computer ARPs for the destination directly:

Your computer broadcasts: "Who has 203.0.113.50? Tell 192.168.1.100"
                          (sent to ff:ff:ff:ff:ff:ff — everyone on the LAN)

203.0.113.50 replies:     "203.0.113.50 is at aa:bb:cc:dd:ee:ff"
                          (sent directly to your MAC — unicast)

If the destination is remote (different subnet — almost always the case for internet traffic), your computer ARPs for the default gateway (your router) instead:

Your computer broadcasts: "Who has 192.168.1.1? Tell 192.168.1.100"
                          (asking for the router's MAC, not the server's)

Router replies:            "192.168.1.1 is at 00:11:22:33:44:55"

Your computer sends the IP packet (destination: 203.0.113.50) inside an Ethernet frame addressed to the router's MAC address. The router strips the Ethernet header, looks at the IP destination, consults its routing table, and wraps the packet in a new Ethernet frame for the next hop.

This is the fundamental trick of IP networking: IP addresses are end-to-end, MAC addresses are hop-by-hop.
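The local-or-via-gateway decision is a subnet membership test, easy to show with the stdlib ipaddress module. The addresses come from the example above; the gateway IP is the usual home-router convention.

```python
# Which IP do we ARP for? The destination itself if it's on our subnet,
# otherwise the default gateway. This is the kernel's routing decision
# in miniature.
import ipaddress

def arp_target(dst: str, my_interface: str, gateway: str) -> str:
    """Return the IP whose MAC address we need for the Ethernet frame."""
    subnet = ipaddress.ip_network(my_interface, strict=False)
    if ipaddress.ip_address(dst) in subnet:
        return dst       # same subnet: frame goes directly to the destination
    return gateway       # remote: frame goes to the router, IP header unchanged

print(arp_target("203.0.113.50", "192.168.1.100/24", "192.168.1.1"))  # 192.168.1.1
print(arp_target("192.168.1.42", "192.168.1.100/24", "192.168.1.1"))  # 192.168.1.42
```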

Under the Hood: ARP was defined in RFC 826 (1982) by David Plummer. It's a broadcast protocol — every ARP request is heard by every device on the local network. This is why large flat networks (thousands of hosts on one subnet) have performance problems: ARP broadcasts consume bandwidth and CPU on every host. It's also why VLANs exist — to break large networks into smaller broadcast domains.

Name Origin: ARP stands for Address Resolution Protocol. It "resolves" a Layer 3 address (IP) to a Layer 2 address (MAC). The name is descriptive, not an acronym pun — rare in networking.

# See your ARP cache (recently resolved IP → MAC mappings)
ip neigh show
# or the older command:
arp -n

# Watch ARP traffic in real time
sudo tcpdump -i eth0 arp

# Manually resolve an IP
arping -I eth0 192.168.1.1

Gotcha: ARP has zero authentication. Any device on the LAN can claim to be any IP address — that's ARP spoofing. An attacker sends fake ARP replies saying "192.168.1.1 (the router) is at [attacker's MAC]", and all your traffic goes through them. This is why coffee shop WiFi with no encryption is dangerous, and why managed networks use Dynamic ARP Inspection (DAI) on switches to validate ARP against DHCP records.


Step 4: The TCP Handshake — Establishing a Connection

We have an IP address and a route to get there. Now we need a reliable connection. HTTP runs on TCP, so before any data flows, the client and server perform the three-way handshake:

Client → Server:  SYN         "I want to talk. My sequence number starts at 1000."
Server → Client:  SYN-ACK     "OK. My sequence number starts at 5000. I acknowledge your 1000."
Client → Server:  ACK         "I acknowledge your 5000. Let's go."

Three packets. One round trip. After this, both sides have synchronized sequence numbers and confirmed that both directions of communication work.
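You never implement this handshake yourself; the kernel performs it inside connect() and accept(). A minimal loopback sketch that triggers exactly one three-way handshake:

```python
# Stand up a loopback listener and connect to it. By the time connect()
# returns, the SYN / SYN-ACK / ACK exchange above has already happened.
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0: kernel picks a free port
server.listen(1)
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))  # three-way handshake completes here
conn, _addr = server.accept()        # connection is already established

handshake_ok = conn.getpeername() == client.getsockname()
print(handshake_ok)  # True

for s in (conn, client, server):
    s.close()
```

Run the tcpdump command below against `lo` while executing this and you'll see the same [S], [S.], [.] sequence.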

# Watch the handshake happen (run before the curl)
sudo tcpdump -i eth0 host 203.0.113.50 and port 443 -c 10

# In another terminal
curl -I https://app.example.com/dashboard

In tcpdump output, you'll see:

14:23:01 IP 192.168.1.100.48372 > 203.0.113.50.443: Flags [S], seq 1000
14:23:01 IP 203.0.113.50.443 > 192.168.1.100.48372: Flags [S.], seq 5000, ack 1001
14:23:01 IP 192.168.1.100.48372 > 203.0.113.50.443: Flags [.], ack 5001

[S] = SYN, [S.] = SYN+ACK, [.] = ACK. This three-packet dance is the theoretical minimum needed to establish reliable bidirectional communication.

Name Origin: TCP — Transmission Control Protocol — was designed by Vint Cerf and Bob Kahn in 1974 (RFC 675), later refined into TCP/IP (RFC 793, 1981). The "control" in the name refers to flow control (don't overwhelm the receiver) and congestion control (don't overwhelm the network). These are different problems solved by different mechanisms — people confuse them constantly.

Under the Hood: Each side picks a random Initial Sequence Number (ISN) rather than starting at 0. This prevents packets from old connections (still floating around the network) from being accepted as part of a new connection. The randomness also makes it harder for attackers to forge packets — they'd have to guess the sequence number.

Trivia: The SYN flood attack (1996, Panix ISP) weaponized the three-way handshake. The attacker sends millions of SYN packets with forged source addresses. The server allocates memory for each half-open connection and replies to addresses that don't exist. The SYN backlog fills, and legitimate connections can't get through. The defense — SYN cookies, invented by Daniel J. Bernstein — is elegant: the server encodes its state in the sequence number of the SYN-ACK, so it doesn't need to remember half-open connections at all. SYN cookies are still used today.

What about TIME_WAIT?

After the connection closes, the side that closed first (usually the client) enters TIME_WAIT for twice the Maximum Segment Lifetime (60 seconds on Linux; the original RFC 793 figure is up to 4 minutes). This looks alarming in ss output — thousands of connections in TIME_WAIT — but it's normal and necessary. It ensures that delayed packets from the old connection don't get confused with a new connection on the same port.

# See connection states
ss -tan | awk '{print $1}' | sort | uniq -c | sort -rn

War Story: A team's load balancer started silently dropping new connections under load. No RST packets, no ICMP errors, no firewall logs — just black holes. After days of debugging, dmesg revealed nf_conntrack: table full, dropping packet. The connection tracking table (default 262,144 entries) was full — 80% of entries were TIME_WAIT from a service that created new connections for every request instead of using connection pooling. Fix: increase nf_conntrack_max, reduce nf_conntrack_tcp_timeout_time_wait from 120 to 30 seconds, and add connection pooling.


Step 5: TLS Handshake — Establishing Trust and Encryption

The TCP connection is up, but it's plaintext. The URL starts with https://, so before any HTTP data flows, we need a TLS handshake.

TLS does three things:

1. Authentication — proves you're talking to the real app.example.com, not an impostor
2. Encryption — makes the data unreadable to anyone watching the network
3. Integrity — ensures nobody tampered with the data in transit

The modern TLS 1.3 handshake takes just one round trip (down from two in TLS 1.2):

Client → Server:  ClientHello
                  "I support TLS 1.3, these cipher suites, and here's my key share"

Server → Client:  ServerHello + Certificate + CertificateVerify + Finished
                  "I picked these parameters. Here's my certificate chain proving
                   I'm app.example.com. Here's cryptographic proof I hold the
                   private key. Here's the integrity check."

Client → Server:  Finished
                  "Verified. Here's my integrity check. Application data can flow."

Certificate verification

The server sends a certificate chain: the server certificate, signed by an intermediate CA, signed by a root CA that your browser trusts. The browser verifies:

  1. The certificate's hostname matches app.example.com (or a wildcard *.example.com)
  2. The chain links to a trusted root CA in the browser's trust store
  3. None of the certificates are expired
  4. The cryptographic signatures are valid

If any check fails, you get the browser warning page. If all pass, encryption begins.
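Checks 1 and 3 can be sketched in a few lines. This is a simplified illustration (single left-most wildcard label, naive expiry comparison); real verification follows the full RFC 6125 rules, and in Python `ssl.create_default_context()` performs the entire chain validation for you.

```python
# Two of the four checks, simplified: hostname matching (one wildcard
# label allowed) and expiry. Chain and signature checks are omitted.
from datetime import datetime, timezone

def hostname_matches(cert_name: str, hostname: str) -> bool:
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False   # a wildcard covers exactly one label, never several
    return all(c == "*" or c == h for c, h in zip(cert_labels, host_labels))

def is_expired(not_after: datetime, now: datetime) -> bool:
    return now > not_after

print(hostname_matches("*.example.com", "app.example.com"))    # True
print(hostname_matches("*.example.com", "a.b.example.com"))    # False
print(is_expired(datetime(2025, 1, 1, tzinfo=timezone.utc),
                 datetime(2026, 1, 1, tzinfo=timezone.utc)))   # True
```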

# Inspect a server's TLS certificate
openssl s_client -connect app.example.com:443 -servername app.example.com < /dev/null 2>/dev/null | openssl x509 -text -noout

# Just the expiration dates
echo | openssl s_client -connect app.example.com:443 -servername app.example.com 2>/dev/null | openssl x509 -dates -noout

# Test TLS handshake with verbose output
curl -vI https://app.example.com 2>&1 | grep -E '^\*'

Name Origin: SSL (Secure Sockets Layer) was invented at Netscape in 1994 by Taher Elgamal. SSL 1.0 had critical flaws and was never released. SSL 2.0 shipped with Netscape Navigator 1.1 in 1995. When the IETF took over the standard, Microsoft insisted on renaming it to signal it wasn't Netscape-proprietary — "Transport Layer Security" was the compromise. TLS 1.0 (1999) is essentially SSL 3.1 with minor changes. We still say "SSL certificates" even though SSL itself has been dead since 2015.

Under the Hood: TLS 1.3 uses ephemeral key exchange — fresh session keys are derived for every connection. Even if an attacker steals the server's private key later, they can't decrypt past traffic. This is called forward secrecy, and it became urgent after the Snowden revelations (2013) showed that intelligence agencies were recording encrypted traffic for potential future decryption.

Trivia: Before Let's Encrypt launched in 2015, TLS certificates cost $10–$300/year and required manual setup. Let's Encrypt made them free and automated via the ACME protocol. As of 2024, they've issued over 4 billion certificates and serve 360+ million websites. This single project did more for internet encryption than any government mandate or industry standard.

Gotcha: SNI (Server Name Indication) sends the hostname in plaintext during the TLS handshake — before encryption begins. This means network observers can see which site you're visiting even with HTTPS. They can't see the content, but they can see the destination. Encrypted Client Hello (ECH) aims to fix this, but adoption is still early.


Step 6: HTTP Request — Asking for the Page

The TLS tunnel is up. Now the browser sends the actual HTTP request:

GET /dashboard HTTP/2
Host: app.example.com
User-Agent: Mozilla/5.0 ...
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, br
Cookie: session=abc123

Key parts:

| Part | What it does |
|------|--------------|
| GET | HTTP method — "give me this resource" (no side effects) |
| /dashboard | Path — what we want |
| HTTP/2 | Protocol version — multiplexed, binary, compressed headers |
| Host: | Which website on this server (enables virtual hosting — one IP, many sites) |
| Cookie: | Session state from a previous visit |
| Accept-Encoding: | "I can decompress gzip and Brotli — send compressed if you can" |

Trivia: The Host header is mandatory in HTTP/1.1 (1997) and is what makes virtual hosting work — one server, one IP address, hundreds of websites. Without it, every website would need its own IP address. When you see "400 Bad Request" from a server, a missing Host header is one of the most common causes.

Trivia: The HTTP Referer header is a permanent misspelling of "Referrer." Phillip Hallam-Baker made the typo in RFC 1945 (1996). By the time anyone noticed, too much software depended on the misspelling to fix it. Newer web APIs use the correct spelling "Referrer" — but the HTTP header will be misspelled forever.

Name Origin: HTTP was created by Tim Berners-Lee at CERN in 1989-1991 to help physicists share documents. The original HTTP/0.9 had only one method (GET), no headers, no status codes, and no content types. The response ended when the server closed the connection. The entire protocol fit in one paragraph. He had no idea it would become the foundation of the modern internet economy.

The URL fragment trick

Notice that the URL had no fragment (#section). If it did, the browser would NOT send it to the server — the # and everything after it is handled entirely client-side. The server never sees it. This is why Single-Page Applications use fragment-based routing (#/dashboard) or the History API — the server serves the same page regardless.
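The stdlib URL parser makes the split visible: the fragment is parsed out client-side and never joins the path that goes on the wire.

```python
# Everything after `#` lands in .fragment and is not part of the request.
from urllib.parse import urlsplit

url = urlsplit("https://app.example.com/dashboard#settings")
print(url.path)      # /dashboard  <- what appears in the HTTP request
print(url.fragment)  # settings    <- handled entirely by the browser
```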


Step 7: Load Balancer and Reverse Proxy — Getting to the Right Server

The HTTP request hits 203.0.113.50 — but that's probably not your application server. It's a load balancer or reverse proxy that distributes requests across multiple backend servers.

The load balancer:

1. Terminates TLS (decrypts your request)
2. Inspects the HTTP headers (Host, path, cookies)
3. Picks a backend server based on its algorithm
4. Opens a new connection to the backend (possibly with a fresh TLS handshake)
5. Forwards your request

Client → Load Balancer (203.0.113.50)
         ↓ picks backend based on round-robin, least-connections, or consistent hashing
         → Backend A (10.0.1.10)
         → Backend B (10.0.1.11)  ← selected
         → Backend C (10.0.1.12)

Common load balancing algorithms:

| Algorithm | How it works | Trade-off |
|-----------|--------------|-----------|
| Round-robin | Rotate through backends in order | Simple but ignores server load |
| Least connections | Pick the backend with fewest active connections | Better under uneven load, but favors fast servers |
| Consistent hashing | Hash the request (by IP, cookie, or header) to a backend | Same client hits same server — good for caches, bad for failover |
| Random | Pick a random backend | Surprisingly effective, simple to implement |
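Two of these strategies sketched in a few lines. Backend IPs are from the diagram above; real balancers add health checks and weights, and real consistent hashing uses a ring with virtual nodes rather than the simple modulo shown here (modulo remaps most keys when a backend is added or removed).

```python
# Round-robin vs. hash-based backend selection, stripped to the core idea.
import hashlib
import itertools

BACKENDS = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]

# Round-robin: rotate through backends regardless of load.
rr = itertools.cycle(BACKENDS)
print([next(rr) for _ in range(4)])  # wraps back to the first backend

# Hash-based stickiness: the same client key always maps to the same backend.
def pick_by_hash(client_key: str) -> str:
    digest = hashlib.sha256(client_key.encode()).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]

print(pick_by_hash("192.0.2.7") == pick_by_hash("192.0.2.7"))  # True: sticky
```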

Trivia: NGINX was written by Igor Sysoev starting in 2002 to solve the C10K problem — how to handle 10,000 concurrent connections. At the time, most servers used one thread per connection, so 10K connections meant 10K threads. Sysoev, a Russian sysadmin, worked alone for two years before releasing it in 2004. NGINX now serves over 30% of all websites. F5 Networks acquired it for $670 million in 2019.

Trivia: HAProxy was written by Willy Tarreau in 2000 for a French ISP. He maintained it solo for over a decade. It serves GitHub, Reddit, Stack Overflow, and some of the world's busiest sites.


Step 8: The Application Server Responds

The request finally reaches your application server. The app:

  1. Reads the session cookie, looks up the user
  2. Queries a database for the dashboard data
  3. Renders the HTML template
  4. Returns the response

HTTP/2 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Cache-Control: private, max-age=0
Set-Cookie: session=abc123; Secure; HttpOnly; SameSite=Lax
Strict-Transport-Security: max-age=63072000
Content-Security-Policy: default-src 'self'

<!DOCTYPE html>
<html>...

Key response headers:

| Header | What it does |
|--------|--------------|
| 200 OK | Status code — success. 4xx = your fault, 5xx = server's fault |
| Content-Encoding: gzip | Response is compressed — browser will decompress |
| Cache-Control: private | Only the browser can cache this, not CDNs (it's user-specific) |
| Set-Cookie: ... Secure; HttpOnly; SameSite=Lax | Session cookie with security flags |
| Strict-Transport-Security | HSTS — "from now on, only use HTTPS for this domain" |
| Content-Security-Policy | CSP — "only load resources from this domain" — prevents XSS |

Trivia: HTTP status codes were inspired by FTP's three-digit reply codes (1971). The system is deliberately extensible: 418 "I'm a Teapot" is from a 1998 April Fools' RFC about the Hyper Text Coffee Pot Control Protocol. Despite being a joke, it's implemented by Google, Node.js, and many frameworks. Attempts to reclaim the code for real use have been defeated by developer outcry. Meanwhile, 451 "Unavailable for Legal Reasons" (2015) is explicitly a reference to Ray Bradbury's Fahrenheit 451.

Gotcha: The Secure flag on cookies means "only send over HTTPS." Without it, the cookie is sent over plain HTTP too — anyone sniffing the network can steal the session. HttpOnly means JavaScript can't read the cookie — mitigates XSS attacks. SameSite controls whether the cookie is sent on cross-site requests — mitigates CSRF. A production cookie needs all three.
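The Set-Cookie header from the response above can be built with Python's stdlib, all three flags included. This is a sketch of the header format, not a server implementation.

```python
# Constructing a session cookie with the three security flags.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"
cookie["session"]["secure"] = True      # only send over HTTPS
cookie["session"]["httponly"] = True    # hide from JavaScript (mitigates XSS)
cookie["session"]["samesite"] = "Lax"   # withhold on cross-site requests (mitigates CSRF)

header = cookie["session"].OutputString()
print(header)  # session=abc123 with the HttpOnly, SameSite=Lax, and Secure attributes
```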


Step 9: The Response Travels Back

The response follows the same path in reverse:

App server (10.0.1.11)
  → Load balancer (re-encrypts with client's TLS session)
    → Internet routing (BGP, multiple ISP hops)
      → Your router
        → Your computer
          → Browser decrypts TLS, decompresses gzip, parses HTML
            → Renders the page

But the page isn't done. The HTML references CSS, JavaScript, images, and fonts — each one triggers a new request through (most of) this same pipeline. HTTP/2 multiplexes these requests over a single TCP connection, so there's no handshake overhead for subsequent requests. A modern page might make 30-100 additional requests to fully render.

Under the Hood: HTTP/1.1 suffered from head-of-line blocking — only one request could be in flight per TCP connection. Browsers worked around this by opening 6 parallel TCP connections per domain. HTTP/2 (2015, based on Google's SPDY protocol) fixed this with multiplexing — many requests and responses flow interleaved on a single connection. HTTP/3 (2022) goes further — it abandons TCP entirely for QUIC (UDP-based), eliminating TCP-level head-of-line blocking. Google tested QUIC in Chrome since 2013 before standardizing it.


The Complete Journey — One Picture

[1] Browser cache check → miss

[2] DNS: browser cache → OS cache → /etc/hosts → resolv.conf →
    recursive resolver → root → .com → example.com authoritative
    Result: 203.0.113.50 (TTL 300s)

[3] ARP: "Who has 192.168.1.1 (gateway)?" → router's MAC
    (because 203.0.113.50 is remote, we ARP for the gateway)

[4] TCP: SYN → SYN-ACK → ACK
    (three packets, one round trip, sequence numbers synchronized)

[5] TLS 1.3: ClientHello → ServerHello+Cert+Verify+Finished → Finished
    (one round trip, certificate chain verified, forward-secret keys derived)

[6] HTTP/2: GET /dashboard → through TLS tunnel

[7] Load balancer: terminate TLS, inspect headers, pick backend,
    forward to 10.0.1.11

[8] App server: authenticate, query DB, render, respond 200 OK

[9] Response: back through LB → internet → your router → your browser
    → decrypt → decompress → parse → render → you see the page

Total round trips for first request: DNS (1) + TCP (1) + TLS (1) + HTTP (1) = ~4 round trips minimum. At 50ms latency, that's 200ms before the first byte of the response. Caching, connection reuse, and HTTP/2 multiplexing reduce subsequent requests to ~1 round trip each.
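The same back-of-the-envelope as arithmetic, useful for budgeting latency. The function name and breakdown are illustrative, not a standard API.

```python
# Minimum time before the first response byte: round trips x network latency.
# Server processing time is extra on top of this.
def time_to_first_byte_ms(rtt_ms: float, dns=1, tcp=1, tls=1, http=1) -> float:
    return (dns + tcp + tls + http) * rtt_ms

print(time_to_first_byte_ms(50))                        # 200.0  cold connection
print(time_to_first_byte_ms(50, dns=0, tcp=0, tls=0))   # 50.0   reused connection
```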


Flashcard Check

Q1: What order does DNS resolution check? (browser, OS, hosts file, etc.)

Browser cache → OS cache → /etc/hosts → configured resolver → recursive resolver → authoritative servers. /etc/hosts still wins over DNS on most systems.

Q2: Why does your computer ARP for the router instead of the destination server?

Because the destination is on a different subnet. MAC addresses are hop-by-hop; IP addresses are end-to-end. You send the packet to the router's MAC, which forwards it.

Q3: What three things does TLS provide?

Authentication (proving identity), encryption (confidentiality), and integrity (tamper detection). These are different — you can have encryption without authentication, which is still vulnerable to MITM.

Q4: What does a SYN flood attack exploit?

The three-way handshake. Attacker sends millions of SYNs with forged source IPs. Server allocates memory for half-open connections. SYN cookies defend against this statelessly.

Q5: Cache-Control: no-cache — does it prevent caching?

No. It means "cache but revalidate every time." no-store prevents caching entirely.

Q6: Why is the HTTP Referer header misspelled?

Typo by Phillip Hallam-Baker in RFC 1945 (1996). By the time anyone noticed, too much software depended on the misspelling. It will be misspelled forever.

Q7: What is forward secrecy and why does it matter?

Fresh session keys per connection. Even if the server's private key is compromised later, past traffic can't be decrypted. Important because agencies were recording encrypted traffic for future decryption (Snowden, 2013).

Q8: What is nf_conntrack: table full, dropping packet?

The kernel's connection tracking table is full. New connections are silently dropped — no RST, no logs, just black holes. Increase nf_conntrack_max and add connection pooling.


Exercises

Exercise 1: Trace a real DNS resolution (hands-on)

Use dig to trace the full resolution path for any domain you choose:

dig +trace example.com

Identify: which server answered at each level? What were the TTLs? How many hops did it actually take?

What to look for: You'll see delegations from root (`.`) → TLD (`.com`) → authoritative. Most interesting: the root servers are served by anycast (same IP, multiple physical locations). The TTLs on root and TLD nameserver records are usually very long (48 hours), while the final A record TTL varies by site (30 seconds to 24 hours).

Exercise 2: Watch the full handshake (hands-on)

Capture the TCP+TLS handshake for a real HTTPS request:

sudo tcpdump -i eth0 -c 20 host <some-ip> and port 443 &
curl -sI https://example.com > /dev/null

In the output, identify:

1. The three TCP handshake packets (SYN, SYN-ACK, ACK)
2. The TLS ClientHello (first data packet after the handshake)
3. The TLS ServerHello + certificates (usually the largest packet)

Hint: TCP flags in tcpdump: `[S]` = SYN, `[S.]` = SYN-ACK, `[.]` = ACK, `[P.]` = PSH+ACK (data). The first `[P.]` after the three-way handshake is the TLS ClientHello.

Exercise 3: Verify a TLS certificate chain (hands-on)

Pick any HTTPS site and verify its certificate:

echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null | openssl x509 -text -noout

Answer:

1. Who issued the certificate? (Look for Issuer:)
2. When does it expire?
3. What hostnames does it cover? (Look for Subject Alternative Name)
4. Is there forward secrecy? (Check the cipher suite — look for ECDHE or DHE)

How to check forward secrecy:

echo | openssl s_client -connect example.com:443 2>/dev/null | grep "Cipher"

If the cipher suite contains `ECDHE` (Elliptic Curve Diffie-Hellman Ephemeral) or `DHE` (Diffie-Hellman Ephemeral), you have forward secrecy. If it contains `RSA` for key exchange (not just authentication), you don't.

Exercise 4: Measure each layer's contribution to latency (hands-on)

Use curl to measure how long each phase takes:

curl -w "\
    DNS:        %{time_namelookup}s\n\
    TCP:        %{time_connect}s\n\
    TLS:        %{time_appconnect}s\n\
    First byte: %{time_starttransfer}s\n\
    Total:      %{time_total}s\n" \
    -o /dev/null -s https://example.com

Run it 5 times and compare. Which phases are consistent? Which vary? Why?

What to expect: DNS is often fast (cached) after the first request. TCP and TLS are proportional to network latency (one round trip each). First byte includes server processing time. The variation between runs tells you what's being cached (DNS) vs. what's network-dependent (TCP/TLS).

Exercise 5: The decision (think, don't code)

A page loads in 3 seconds instead of 400ms. Based on what you've learned, which layer would you investigate first for each symptom?

  1. DNS takes 2.5 seconds (the rest is fast)
  2. TCP connect takes 800ms
  3. TLS takes 1.5 seconds
  4. First byte takes 2 seconds (TCP and TLS are fast)
  5. Total time is 3 seconds but first byte was 200ms
Answers:

1. **DNS:** Slow or unreachable upstream resolver. Check `resolv.conf`, try a different resolver (`dig @8.8.8.8`). Could be ndots issues in Kubernetes (every dotted name gets 5 search domain appends before the real lookup).
2. **TCP:** High network latency or packet loss. Check with `ping` and `mtr`. Could be geographic distance, congested link, or routing problem.
3. **TLS:** Large certificate chain, slow server crypto, or MTU issues causing fragmentation. Check certificate chain size with `openssl s_client`. If ICMP "Fragmentation Needed" is blocked, Path MTU Discovery fails and TLS handshake fragments are dropped silently.
4. **First byte slow, network fast:** Server-side problem — slow database query, cold cache, resource contention. Check application and database logs.
5. **First byte fast, total slow:** Large response body or many sub-resources. Check `Content-Length`, compression (`Accept-Encoding`), and the number of subsequent requests the page triggers. HTTP/2 should be multiplexing these.

Cheat Sheet

DNS

| Task | Command |
|------|---------|
| Resolve a name | dig +short example.com |
| Full resolution trace | dig +trace example.com |
| Query specific server | dig @8.8.8.8 example.com |
| Reverse lookup | dig -x 203.0.113.50 |
| Check all record types | dig example.com ANY |

TCP/Network

| Task | Command |
|------|---------|
| ARP cache | ip neigh show |
| Routing table | ip route show |
| Connection states | ss -tan |
| Capture packets | sudo tcpdump -i eth0 host X and port Y |
| Trace route | mtr -n example.com |
| Test TCP connectivity | nc -zv host port |

TLS

| Task | Command |
|------|---------|
| Inspect certificate | echo \| openssl s_client -connect host:443 2>/dev/null \| openssl x509 -text |
| Check expiration | echo \| openssl s_client -connect host:443 2>/dev/null \| openssl x509 -dates |
| Test handshake | curl -vI https://host 2>&1 \| grep '^\*' |

HTTP Timing

curl -w "DNS:%{time_namelookup} TCP:%{time_connect} TLS:%{time_appconnect} TTFB:%{time_starttransfer} Total:%{time_total}\n" -o /dev/null -s URL

HTTP Status Codes

| Range | Meaning | Common ones |
|-------|---------|-------------|
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirect | 301 Permanent, 302 Found, 307 Temporary |
| 4xx | Client error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found |
| 5xx | Server error | 500 Internal, 502 Bad Gateway, 503 Unavailable |

Takeaways

  1. 9 steps from click to page. Browser cache → DNS → ARP → TCP → TLS → HTTP → load balancer → app → response. Every one of them can fail independently.

  2. IP addresses are end-to-end, MAC addresses are hop-by-hop. Your packet gets rewrapped in a new Ethernet frame at every router. Only the IP header survives the journey.

  3. DNS is hierarchical and heavily cached. Root → TLD → authoritative, with caching at every level. Most lookups need 1-2 queries, not 4.

  4. TLS 1.3 is one round trip. ClientHello and ServerHello, with forward-secret key exchange. Certificate chain verification happens locally in the browser.

  5. When something is slow, measure each layer. curl -w with timing variables tells you exactly which layer is the bottleneck. Don't guess — measure.

  6. The conntrack table is a silent killer. When it fills, connections are dropped with no logs and no errors. dmesg is the only place you'll see it.


Related Lessons

  • The Hanging Deploy — incident-driven look at processes, signals, and systemd
  • Connection Refused — differential diagnosis of a common error across layers