GraphQL

30 cards: 🟢 9 easy | 🟡 15 medium | 🔴 6 hard

🟢 Easy (9)

1. What are the five built-in scalar types in GraphQL SDL?

String, Int, Float, Boolean, and ID. ID is serialized as a String but signals to the type system that it is an opaque, unique identifier rather than human-readable text.

Name origin: GraphQL was created by Facebook in 2012, open-sourced in 2015. The Graph refers to the application data graph it exposes.

Remember: SIFBI — String, Int, Float, Boolean, ID. Five built-in scalars.

Gotcha: Custom scalars (DateTime, JSON, URL) must be explicitly defined and have serialization logic.

2. What does the exclamation mark (!) mean in a GraphQL type declaration?

It marks the field as non-nullable — the server guarantees it will never return null for that field. A field without ! is nullable and may return null. A list type like [Order!]! means the list itself is non-null and every element in it is also non-null.

3. What are the three root operation types in GraphQL?

Query (read-only data fetching), Mutation (state-changing operations), and Subscription (real-time event streams). Every operation in a GraphQL document is one of these three types; when the keyword is omitted (the shorthand form), the operation is treated as a query.

Remember: QMS — Query (read), Mutation (write), Subscription (stream). Three root operations.

Gotcha: Despite the name, a Query can trigger side effects in a poorly designed server. The convention is that only Mutations change state.

Fun fact: Subscriptions use WebSocket (graphql-ws protocol) or Server-Sent Events for real-time updates.

4. Why does GraphQL use separate Input types for mutations instead of reusing Object types?

Object types can contain fields that reference other Object types, take arguments, and may include computed or resolver-backed fields that don't make sense as inputs. Input types are simpler: they are plain data bags whose fields are scalars, enums, lists, or other input types, with no arguments or resolvers. The separation keeps the read schema and write schema distinct and prevents accidentally exposing server-side computed fields as writable arguments.

5. What are the four arguments passed to every GraphQL resolver function?

(1) parent (or root): the resolved value of the parent field. (2) args: the arguments passed to this field in the query. (3) context: shared request-scoped object containing DB connections, auth info, DataLoader instances, etc. (4) info: the query AST, field path, and schema — used for optimizations and projections.
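
A minimal sketch of the four-argument signature, written without any GraphQL library. The `Context` shape, the `Info` stand-in, and the `User.orders(limit)` field are assumptions made for illustration only.

```typescript
// Hypothetical request-scoped context; the field names are illustrative.
interface Context {
  currentUserId: string | null;
  db: { findOrdersByUser(userId: string, limit: number): string[] };
}

interface User { id: string }

// Minimal stand-in for the much richer info object a real server passes.
interface Info { fieldName: string }

// Resolver for a hypothetical User.orders(limit: Int) field.
function ordersResolver(
  parent: User,            // resolved value of the parent field
  args: { limit: number }, // arguments supplied to this field
  context: Context,        // shared per-request state
  info: Info,              // metadata about the field being resolved
): string[] {
  if (context.currentUserId !== parent.id) {
    throw new Error(`Not authorized to read ${info.fieldName}`);
  }
  return context.db.findOrdersByUser(parent.id, args.limit);
}

const context: Context = {
  currentUserId: "u1",
  db: {
    findOrdersByUser: (userId, limit) =>
      [`${userId}-order-1`, `${userId}-order-2`].slice(0, limit),
  },
};

const orders = ordersResolver({ id: "u1" }, { limit: 1 }, context, { fieldName: "orders" });
console.log(orders);
```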

6. What problems does GraphQL solve that REST APIs commonly exhibit?

Over-fetching: REST endpoints return fixed shapes — clients receive fields they don't need. Under-fetching: a single screen may require multiple REST round trips to assemble its data. GraphQL lets clients request exactly the fields they need in a single request, eliminating both problems.

7. What is GraphQL introspection and why is it disabled in production?

Introspection is a built-in mechanism that lets clients query the schema itself — listing all types, fields, arguments, directives, and deprecation notes. It powers tools like GraphiQL. In production it is disabled because it exposes the full API surface to potential attackers, including deprecated field names that may contain migration hints revealing internal design decisions.

8. What HTTP status code does a GraphQL server return for a query that partially fails?

200 OK, even when errors occurred. GraphQL returns partial data alongside an errors array. The presence of HTTP 200 does not mean the response is error-free — callers must always check for an errors field in the response body.
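
One way a client might act on this, sketched with a hand-written response type rather than any particular client library. The `classify` helper and the sample body are hypothetical.

```typescript
interface GraphQLError { message: string; path?: (string | number)[] }
interface GraphQLResponse<T> { data?: T | null; errors?: GraphQLError[] }

// Classify a response body regardless of the HTTP status code.
function classify<T>(body: GraphQLResponse<T>): "ok" | "partial" | "failed" {
  const hasErrors = (body.errors?.length ?? 0) > 0;
  const hasData = body.data != null;
  if (!hasErrors) return "ok";
  return hasData ? "partial" : "failed";
}

// Hypothetical body returned with HTTP 200: one field resolved, one failed.
const body: GraphQLResponse<{ user: { name: string; avatar: null } }> = {
  data: { user: { name: "Ada", avatar: null } },
  errors: [{ message: "Avatar service unavailable", path: ["user", "avatar"] }],
};

console.log(classify(body)); // "partial"
```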

9. What is a GraphQL fragment and when do you use one?

A fragment is a reusable selection set named and defined separately from a query. You use fragments to avoid repeating the same field list across multiple queries, to share field selections between operations, and to co-locate data requirements with the UI component that uses them (the Relay/Colocated Fragments pattern).

🟡 Medium (15)

1. Explain the N+1 problem in GraphQL and how DataLoader solves it.

N+1 occurs when a query fetches a list of N items and each item's resolver fires an independent DB query to load a related resource. Resolving a user field on 100 orders generates 101 queries: one for the list plus one per order. DataLoader solves it by batching: it collects all .load() calls within the same event loop tick, fires a single batch query for all IDs, and returns results mapped back to each caller. This reduces 101 queries to 2.
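
The batching idea can be illustrated with a deliberately tiny loader. This is not the DataLoader library: `TinyLoader` is a hypothetical stand-in with no caching or error handling, and it schedules its flush with a microtask, which is a simplification of how a real loader defers the batch.

```typescript
// Simplified stand-in for DataLoader: collects keys requested in the same tick
// and resolves them with one batch call.
class TinyLoader<K, V> {
  private queue: { key: K; resolve: (value: V) => void }[] = [];
  private readonly batchFn: (keys: K[]) => Promise<V[]>;

  constructor(batchFn: (keys: K[]) => Promise<V[]>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve) => {
      this.queue.push({ key, resolve });
      // Schedule a single flush when the first key of a batch arrives.
      if (this.queue.length === 1) queueMicrotask(() => this.flush());
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map((item) => item.key));
    batch.forEach((item, index) => item.resolve(values[index]));
  }
}

let batchCalls = 0;
const userLoader = new TinyLoader<number, string>(async (ids) => {
  batchCalls += 1; // stands in for one "WHERE id IN (...)" query
  return ids.map((id) => `user-${id}`);
});

// Three order resolvers each ask for their user within the same tick.
const loaded = Promise.all([1, 2, 3].map((id) => userLoader.load(id)));
loaded.then((users) => console.log(users, "batch calls:", batchCalls));
```

All three `load` calls are queued before the microtask runs, so the batch function is invoked once for the whole list.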

2. Why must DataLoader instances be created per-request, not as module-level singletons?

DataLoader maintains an in-memory cache keyed by ID. A singleton would accumulate stale entries across requests and serve one user's data to another user (cache poisoning). Per-request instances are created fresh in the context factory, share their cache only within that request's resolver tree, and are garbage-collected when the request ends.
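
A sketch of the per-request pattern, using a plain `Map` in place of a DataLoader instance so that it runs without dependencies. `createContext` and `loadProfile` are hypothetical names.

```typescript
type UserCache = Map<string, string>;
interface RequestContext { viewerId: string; userCache: UserCache }

// Context factory: called once per incoming request, so each request
// starts with an empty cache that is discarded when the request ends.
function createContext(viewerId: string): RequestContext {
  return { viewerId, userCache: new Map() };
}

// Hypothetical lookup whose result depends on who is asking.
function loadProfile(ctx: RequestContext, userId: string): string {
  const cached = ctx.userCache.get(userId);
  if (cached !== undefined) return cached;
  const value = `profile of ${userId} as seen by ${ctx.viewerId}`;
  ctx.userCache.set(userId, value);
  return value;
}

const requestA = createContext("alice");
const requestB = createContext("bob");
const seenByAlice = loadProfile(requestA, "u42");
const seenByBob = loadProfile(requestB, "u42");
console.log(seenByAlice);
console.log(seenByBob);
```

Had both requests shared one module-level cache, the second lookup would have returned the value computed for the first viewer.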

3. What is the difference between offset pagination and cursor pagination in GraphQL, and when should you prefer each?

Offset pagination uses limit/offset arguments and is simple but unstable under concurrent writes — new inserts shift rows and cause duplicates or skips across pages. Cursor pagination uses opaque cursors (typically base64-encoded IDs or timestamps) and is stable under writes and efficient at any depth. Use cursor pagination for any list that grows under production load; use offset only for small, stable, admin-facing lists.
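
The instability is easy to reproduce in memory. The sketch below assumes rows sorted newest-first by a numeric id and a cursor that wraps the last-seen id; both choices are illustrative.

```typescript
interface Row { id: number; title: string }

// Cursors are opaque to clients; here they wrap the last-seen id (an assumption).
const encodeCursor = (id: number): string => btoa(`id:${id}`);
const decodeCursor = (cursor: string): number => Number(atob(cursor).slice(3));

function pageByOffset(rows: Row[], offset: number, limit: number): Row[] {
  return rows.slice(offset, offset + limit);
}

function pageByCursor(rows: Row[], after: string | null, limit: number): Row[] {
  const lastId = after === null ? Infinity : decodeCursor(after);
  return rows.filter((row) => row.id < lastId).slice(0, limit);
}

let rows: Row[] = [5, 4, 3, 2, 1].map((id) => ({ id, title: `post ${id}` }));
const firstPage = pageByOffset(rows, 0, 2); // ids 5 and 4
const cursor = encodeCursor(firstPage[firstPage.length - 1].id);

// A new row arrives between the two page requests.
rows = [{ id: 6, title: "post 6" }, ...rows];

const offsetPage2 = pageByOffset(rows, 2, 2).map((r) => r.id);      // [4, 3]: id 4 repeats
const cursorPage2 = pageByCursor(rows, cursor, 2).map((r) => r.id); // [3, 2]: no repeat
console.log({ offsetPage2, cursorPage2 });
```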

4. What fields make up a Relay-spec connection type?

A connection type contains edges (a list of edge objects) and pageInfo (pagination metadata). Each edge contains node (the actual item) and cursor (an opaque pointer for pagination). pageInfo contains hasNextPage, hasPreviousPage, startCursor, and endCursor. Many schemas also add totalCount to the connection, but that is a common convention rather than part of the Relay specification.
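
The shape can be written down as TypeScript types. The `toConnection` helper is hypothetical, and using the array index as the cursor payload is a simplification for illustration.

```typescript
interface PageInfo {
  hasNextPage: boolean;
  hasPreviousPage: boolean;
  startCursor: string | null;
  endCursor: string | null;
}
interface Edge<T> { node: T; cursor: string }
interface Connection<T> { edges: Edge<T>[]; pageInfo: PageInfo; totalCount?: number }

// Build a connection from an already-sorted list, taking `first` items from `offset`.
function toConnection<T>(items: T[], offset: number, first: number): Connection<T> {
  const edges = items
    .slice(offset, offset + first)
    .map((node, i) => ({ node, cursor: btoa(`idx:${offset + i}`) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: offset + first < items.length,
      hasPreviousPage: offset > 0,
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
    totalCount: items.length,
  };
}

const connection = toConnection(["a", "b", "c", "d"], 1, 2);
console.log(JSON.stringify(connection, null, 2));
```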

5. What should every GraphQL error extension include for programmatic handling?

A machine-readable code in the extensions object (e.g., extensions.code: "NOT_FOUND"). Clients must never parse the human-readable message string for logic decisions — it can change. The code field enables stable error handling, retry logic, and localized error display. Additional extension fields like the resource type or ID help callers understand context without embedding it in the message.
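
A client-side sketch of branching on the code rather than the message. The specific codes and the `nextAction` helper are illustrative, not a standard list.

```typescript
interface GraphQLError {
  message: string;
  extensions?: { code?: string; [key: string]: unknown };
}

// Decide what to do from the stable code, never from the message text.
function nextAction(error: GraphQLError): "retry" | "login" | "show-not-found" | "report" {
  switch (error.extensions?.code) {
    case "NOT_FOUND": return "show-not-found";
    case "UNAUTHENTICATED": return "login";
    case "RATE_LIMITED": return "retry";
    default: return "report";
  }
}

const error: GraphQLError = {
  message: "Order 42 could not be located", // wording may change between releases
  extensions: { code: "NOT_FOUND", resourceType: "Order", resourceId: "42" },
};
console.log(nextAction(error)); // "show-not-found"
```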

6. How do query depth limits and complexity scoring protect a GraphQL API from abuse?

Depth limits reject queries nested beyond a threshold (e.g., 5 levels), blocking recursive fan-out attacks. Complexity scoring assigns a cost to each field (with list fields multiplied by requested count) and rejects queries whose total cost exceeds a budget. Together they cap the worst-case work any single query can force the server to perform, making the API resistant to both adversarial and accidental expensive queries.
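
A depth check can be sketched over a simplified selection tree. A real server would walk the parsed query AST as a validation rule; the `Selection` type and helper names here are assumptions.

```typescript
// Simplified selection tree: field name -> nested selections.
interface Selection { [field: string]: Selection }

function depthOf(selection: Selection): number {
  const childDepths = Object.values(selection).map((child) => depthOf(child));
  return childDepths.length === 0 ? 0 : 1 + Math.max(...childDepths);
}

function assertWithinDepth(selection: Selection, maxDepth: number): void {
  const depth = depthOf(selection);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit ${maxDepth}`);
  }
}

// Represents: { user { friends { friends { friends { name } } } } }
const recursiveQuery: Selection = {
  user: { friends: { friends: { friends: { name: {} } } } },
};

console.log(depthOf(recursiveQuery)); // 5
```

With a limit of 4 this query would be rejected before any resolver runs; with a limit of 5 it passes.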

7. What are Automatic Persisted Queries (APQ) and what security benefit do they provide?

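APQ is a protocol where clients send only a SHA-256 hash of the query. On first request, the server returns PersistedQueryNotFound; the client resends with the full query and hash, and the server caches it. Subsequent requests use the hash only. The main benefits are smaller request payloads and the ability to send queries as cacheable GET requests. APQ alone is not a security control, because any client can register any query at runtime. The security benefit comes from the stricter variant, a persisted-query allowlist (safelisting) built at deploy time: the server then accepts only pre-registered queries, arbitrary queries are rejected, and the executable query surface is bounded to what clients actually ship.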
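
The hash-then-register handshake can be sketched with an in-memory cache. The request and response shapes are simplified, and the `HashMismatch` label is made up for this example; only `PersistedQueryNotFound` comes from the text above.

```typescript
import { createHash } from "node:crypto";

type ApqRequest = { sha256Hash: string; query?: string };
type ApqResponse = { ok: true; query: string } | { ok: false; error: string };

const sha256 = (text: string): string => createHash("sha256").update(text).digest("hex");

// In-memory stand-in for the server's persisted-query cache.
const queryCache = new Map<string, string>();

function handle(request: ApqRequest): ApqResponse {
  const cached = queryCache.get(request.sha256Hash);
  if (cached !== undefined) return { ok: true, query: cached };
  if (request.query === undefined) return { ok: false, error: "PersistedQueryNotFound" };
  // Reject registrations whose hash does not match the query text.
  if (sha256(request.query) !== request.sha256Hash) return { ok: false, error: "HashMismatch" };
  queryCache.set(request.sha256Hash, request.query);
  return { ok: true, query: request.query };
}

const query = "{ viewer { id } }";
const hash = sha256(query);
const first = handle({ sha256Hash: hash });         // cache miss
const second = handle({ sha256Hash: hash, query }); // registers the query
const third = handle({ sha256Hash: hash });         // served from the cache
console.log(first, second, third);
```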

8. What is the recommended WebSocket subprotocol for GraphQL subscriptions and what replaced the original one?

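The current recommended protocol is the one implemented by the graphql-ws npm library. It replaced subscriptions-transport-ws, which was abandoned in 2020; the old library had open security issues and was not being maintained. Note the naming trap: the graphql-ws library negotiates the WebSocket subprotocol graphql-transport-ws, while the legacy subscriptions-transport-ws library used the subprotocol name graphql-ws, so the two are not interoperable and client and server must be migrated together. Apollo Client supports the newer protocol through GraphQLWsLink; the older WebSocketLink targets the legacy protocol. Server-Sent Events (SSE) over HTTP is an alternative transport now covered by the GraphQL over HTTP specification (2023).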

9. What is Apollo Federation and what problem does it solve?

Apollo Federation enables multiple independent GraphQL services (subgraphs) to each own a slice of the schema. An Apollo Router composes them into a unified supergraph and routes query fragments to the right service. It solves the monolith vs. fragmentation problem: without federation, either one team owns the entire schema (bottleneck) or each service exposes its own API (no unified graph). Federation lets teams move independently while presenting a single endpoint to clients.

10. Which schema changes are breaking and which are non-breaking?

Non-breaking: adding a new field, adding a nullable (optional) argument, deprecating a field. Adding an enum value is non-breaking for the schema itself but can still break clients that switch exhaustively on that enum. Breaking: removing a field, renaming a field, changing a field's type, adding a required (non-null) argument to an existing field, removing an enum value. The safe migration path for any breaking change is deprecation first → one release cycle → removal.

11. Why is HTTP response caching harder with GraphQL than with REST, and what are the main workarounds?

REST maps resources to URLs, so GET /users/42 is naturally cacheable by CDNs and browsers. GraphQL typically uses POST to /graphql with the query in the body — POST is not cached by default. Workarounds: (1) Persisted queries sent as GET requests with a hash parameter — CDN-cacheable. (2) @cacheControl directives that instruct Apollo Server to set Cache-Control headers. (3) Application-level caching via DataLoader (request-scoped) and Redis (cross-request).

12. What are GraphQL variables and why should you always use them instead of string interpolation?

Variables are typed, named inputs declared in the operation signature and passed alongside the query as a separate JSON object. They prevent injection attacks (analogous to parameterized SQL vs. string-interpolated SQL), enable query plan caching on the server (same query shape with different variable values reuses the parsed AST), and make queries reusable across different inputs without string manipulation.
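
The contrast with string interpolation can be shown by building both request bodies. The `order` field and the hostile input are hypothetical.

```typescript
// Operation text stays constant; only the variables object changes per call.
const ORDER_QUERY = `
  query OrderById($id: ID!) {
    order(id: $id) { id status }
  }
`;

function buildRequestBody(id: string): string {
  return JSON.stringify({ query: ORDER_QUERY, variables: { id } });
}

// Anti-pattern shown for contrast: user input becomes part of the query text.
function buildInterpolated(id: string): string {
  return JSON.stringify({ query: `{ order(id: "${id}") { id status } }` });
}

const hostile = '1") { id } secrets: allOrders { id total } x: order(id: "2';
const safe = JSON.parse(buildRequestBody(hostile));
const unsafe = JSON.parse(buildInterpolated(hostile));

console.log(safe.query === ORDER_QUERY); // true: the operation text is unchanged
console.log(unsafe.query);               // the injected selection is now part of the query
```

With variables, the hostile string travels as data and can only ever be the value of `$id`.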

13. What are the built-in execution directives @skip and @include used for?

@include(if: Boolean!) and @skip(if: Boolean!) conditionally include or exclude fields in a query based on a variable value. They let a single query document handle multiple display states without sending multiple queries.
Example: query ($showAvatar: Boolean!) { user { id name avatar @include(if: $showAvatar) { url } } }. The variable must be declared in the operation signature. The condition is evaluated per-request, so clients can toggle fields based on feature flags or user settings.
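
The selection rule the two directives imply can be sketched as a small predicate. The `FieldDirectives` representation is an assumption; a real executor reads the directives from the query AST.

```typescript
type Variables = Record<string, boolean>;

// Each property holds the name of the variable passed to `if:`.
interface FieldDirectives { skip?: string; include?: string }

// A field is kept only if @skip is not true and @include is not false.
function isSelected(directives: FieldDirectives, variables: Variables): boolean {
  if (directives.skip !== undefined && variables[directives.skip] === true) return false;
  if (directives.include !== undefined && variables[directives.include] !== true) return false;
  return true;
}

// Represents: avatar @include(if: $showAvatar)
const avatar: FieldDirectives = { include: "showAvatar" };
console.log(isSelected(avatar, { showAvatar: true }));  // true
console.log(isSelected(avatar, { showAvatar: false })); // false
```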

14. What are the trade-offs between schema-first and code-first GraphQL development?

Schema-first: write SDL first, implement resolvers to match. The schema is a readable, language-agnostic contract that exists before any code runs — ideal for multi-team federation and schema registry workflows. Drift risk if codegen is skipped. Code-first: decorators or builder APIs derive the schema from code. Types stay in sync with resolvers by construction, better IDE support. Schema is implicit and requires running the server to extract — harder to share as a contract.

15. When do you use a GraphQL union vs. an interface?

Interface: multiple types share a common set of fields that clients will always query (e.g., Node with id, or Animal with name and species). Use when the shared fields drive client logic. Union: types have no shared fields but a resolver might return any of them (e.g., SearchResult = User | Order | Product). Use when the types are genuinely heterogeneous. Unions require inline fragments to query type-specific fields; interfaces allow querying shared fields directly plus inline fragments for type-specific ones.

🔴 Hard (6)

1. How does Apollo Router plan and execute a federated query across subgraphs?

The router receives the full client query and uses the composed supergraph schema to build a query plan — a tree of fetch operations. It identifies which fields belong to which subgraphs, parallelizes independent fetches, and sequences dependent fetches (e.g., fetch User from users-service, then use the returned id to fetch Orders from orders-service via the @key extension mechanism). Results are merged and returned to the client as a single response. The query plan is computed from the schema, not hardcoded.

2. What is the contract a DataLoader batch function must fulfill, and what happens if it is violated?

The batch function receives an array of keys and must return a Promise that resolves to an array of values in the exact same order and length as the input keys. If a key has no result, return null or an Error instance at that index; never skip an index or change the order. Returning an array of the wrong length makes DataLoader reject the whole batch with an error, which at least fails loudly. Returning the right length in the wrong order is worse: DataLoader resolves the wrong value for each .load() call, producing silent data corruption where users see other users' data or resolvers return incorrect objects.
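
A common way to satisfy the contract is to index the fetched rows by key and then map over the input keys. In this sketch `fetchUsersByIds` is a hypothetical data source that returns rows in arbitrary order and omits missing ids.

```typescript
interface UserRow { id: number; name: string }

// Simulated database call: arbitrary row order, missing ids simply absent.
async function fetchUsersByIds(ids: readonly number[]): Promise<UserRow[]> {
  const table: UserRow[] = [{ id: 3, name: "Cy" }, { id: 1, name: "Ada" }];
  return table.filter((row) => ids.includes(row.id));
}

// Batch function: one result slot per key, in key order, null where nothing was found.
async function batchUsers(ids: readonly number[]): Promise<(UserRow | null)[]> {
  const rows = await fetchUsersByIds(ids);
  const byId = new Map(rows.map((row) => [row.id, row] as const));
  return ids.map((id) => byId.get(id) ?? null);
}

const batchResult = batchUsers([1, 2, 3]);
batchResult.then((users) => console.log(users.map((u) => u?.name ?? null)));
```

Returning `rows` directly would hand back two values for three keys, in the wrong order.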

3. What mechanism ensures subscription resolvers clean up their PubSub listeners when a client disconnects?

When subscription resolvers are implemented as async generators (a common pattern with graphql-ws servers), the generator's finally block runs when the generator is closed, whether by normal completion, client disconnect, or an error, provided the generator has already started executing. The transport layer calls .return() on the generator on disconnect, which triggers the finally block. This is where you call pubsub.unsubscribe(), remove event listeners, or clear intervals, so that no zombie listeners outlive the connection.
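
The cleanup behaviour can be observed with plain JavaScript async generators. The in-memory `listeners` set is a stand-in for a real PubSub, and `orderEvents` is a hypothetical subscription source.

```typescript
type Listener = (message: string) => void;

// Minimal in-memory pub/sub registry.
const listeners = new Set<Listener>();

async function* orderEvents(): AsyncGenerator<string, void, void> {
  const pending: string[] = [];
  let wake: (() => void) | null = null;
  const listener: Listener = (message) => { pending.push(message); wake?.(); };
  listeners.add(listener);
  try {
    while (true) {
      if (pending.length === 0) await new Promise<void>((resolve) => { wake = resolve; });
      yield pending.shift() as string;
    }
  } finally {
    // Runs when the consumer calls .return(), e.g. on client disconnect.
    listeners.delete(listener);
  }
}

async function demo(): Promise<number[]> {
  const stream = orderEvents();
  const firstEvent = stream.next(); // starts the generator and registers the listener
  listeners.forEach((notify) => notify("order-created"));
  await firstEvent;
  const whileConnected = listeners.size;
  await stream.return(undefined);   // what a transport does on disconnect
  return [whileConnected, listeners.size];
}

const cleanup = demo();
cleanup.then(([before, after]) => console.log({ before, after }));
```

The listener count drops from one to zero once `.return()` is called, without any explicit unsubscribe call at the call site.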

4. How do you assign complexity costs to fields that return paginated lists, and why does the list size argument matter?

Paginated list fields multiply their base cost by the first or limit argument. A field returning 100 items is ~100x more expensive than one returning 1. Without argument-aware cost calculation, a client requesting first: 1000 would pay the same complexity cost as first: 1, defeating the purpose of limits. Tools like graphql-cost-analysis let you declare which arguments act as multipliers on specific fields, so the complexity engine correctly scales cost with the requested page size.
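
A toy cost function makes the scaling concrete. The flat base cost of 1 per field and the `FieldNode` shape are assumptions; real tools let you configure costs per field.

```typescript
// Simplified query node: a field, its page-size argument (if any), and children.
interface FieldNode { name: string; first?: number; children?: FieldNode[] }

function costOf(node: FieldNode): number {
  const childCost = (node.children ?? []).reduce((sum, child) => sum + costOf(child), 0);
  const multiplier = node.first ?? 1; // list fields scale their subtree by page size
  return 1 + multiplier * childCost;
}

// Represents: orders(first: N) { id total }
const ordersQuery = (first: number): FieldNode => ({
  name: "orders",
  first,
  children: [{ name: "id" }, { name: "total" }],
});

console.log(costOf(ordersQuery(1)));    // 3
console.log(costOf(ordersQuery(1000))); // 2001
```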

5. What is the error union pattern and when is it preferred over returning null with errors?

The error union pattern defines a union return type for mutations that can include both success and typed failure cases: UpdateOrderResult = Order | OrderNotFound | ValidationError. Each error type has specific fields. Clients pattern-match on __typename to handle each case. It is preferred when: (1) errors are expected domain outcomes (not infrastructure failures), (2) clients need structured error data (not just a message), (3) you want exhaustive error handling enforced by the type system. Null + errors array is acceptable for simple read queries on nullable fields but lacks type safety.
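
On the client, the union maps naturally onto a TypeScript discriminated union. The fields on each member type are hypothetical; the `never` assignment is one way to get the compiler to enforce exhaustive handling.

```typescript
type Order = { __typename: "Order"; id: string; status: string };
type OrderNotFound = { __typename: "OrderNotFound"; orderId: string };
type ValidationError = { __typename: "ValidationError"; field: string; reason: string };
type UpdateOrderResult = Order | OrderNotFound | ValidationError;

function describeResult(result: UpdateOrderResult): string {
  switch (result.__typename) {
    case "Order":
      return `Order ${result.id} is now ${result.status}`;
    case "OrderNotFound":
      return `No order with id ${result.orderId}`;
    case "ValidationError":
      return `Invalid ${result.field}: ${result.reason}`;
    default: {
      // Compile-time exhaustiveness check: adding a union member breaks this line.
      const unreachable: never = result;
      return unreachable;
    }
  }
}

console.log(
  describeResult({ __typename: "ValidationError", field: "quantity", reason: "must be positive" }),
);
```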

6. Describe a CI pipeline gate that prevents accidental breaking schema changes from reaching production.

Steps: (1) Export the current production schema via introspection and store it as a baseline artifact. (2) In CI, after any change to the schema file, run graphql-inspector diff with --fail-on-breaking to compare the baseline against the proposed schema. (3) Block the deployment if breaking changes are detected unless a --allow-breaking override is explicitly set and reviewed. (4) On successful deployment, update the baseline artifact in the schema registry with the new schema. This creates a guardrail that forces engineers to consciously acknowledge breaking changes rather than shipping them accidentally.