
Generative AI


23 cards — 🟢 4 easy | 🟡 7 medium | 🔴 4 hard

🟢 Easy (4)

1. Beginner Explanation of FAISS?

FAISS is a library that allows you to quickly find similar items in a large dataset of vectors. For example, if you have a sentence embedding vector for the query "I like to play football", FAISS can efficiently search through millions or billions of other sentence embedding vectors to find the ones that are most similar.

To use FAISS, you first need to create an index from your dataset of vectors. This involves some preprocessing to optimize the index for fast similarity search.
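The index-then-search workflow can be illustrated with a plain NumPy brute-force search, which is essentially what a FAISS flat index does (the dataset and sizes here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 64)).astype("float32")   # 1000 stored embeddings
query = rng.standard_normal(64).astype("float32")        # one query embedding

# Exact (brute-force) search: squared L2 distance to every stored vector,
# then keep the k smallest -- the behavior of a flat index.
dists = ((db - query) ** 2).sum(axis=1)
k = 5
topk = np.argsort(dists)[:k]
```

FAISS's more advanced indexes avoid scanning all 1000 (or a billion) vectors, but return the same kind of top-k result.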

2. Learning the Data Distribution?

Generative models learn the probability distribution of the training data. This allows them to generate new samples that are statistically similar to the original data[2].

Analogy: Like a music student who studies thousands of songs until they can compose new melodies that sound right — they learned the distribution of notes and rhythms.

Remember: Generative models learn P(data) — the probability distribution. Discriminative models learn P(label|data) — the decision boundary.
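A minimal NumPy sketch of the generative view: fit a one-dimensional Gaussian to toy data, then draw new, statistically similar samples from the learned distribution (data and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)  # toy "training data"

# Learning P(data): for a Gaussian model this is just the mean and std.
mu, sigma = data.mean(), data.std()

# Sampling from the learned distribution yields new points that look
# statistically like the training data.
samples = rng.normal(mu, sigma, size=10_000)
```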

3. Evolution of Next Word Prediction Models?

Before Transformers, sequence models evolved through:

- **RNNs**: Process sequences via hidden states but struggle with long-range dependencies (vanishing gradients)
- **LSTMs**: Add gated memory cells to retain information over longer sequences, mitigating the vanishing gradient problem
- **GRUs**: Simplify LSTMs by merging cell/hidden state — more efficient with similar long-range capability

All three are limited by sequential processing, preventing parallelization and limiting scalability.
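The sequential bottleneck is visible in a minimal vanilla-RNN step, sketched in NumPy with random weights (illustrative only): each hidden state depends on the previous one, so the time loop cannot be parallelized.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_h, seq_len = 4, 8, 6
W_xh = rng.standard_normal((d_h, d_in)) * 0.1  # input-to-hidden weights
W_hh = rng.standard_normal((d_h, d_h)) * 0.1   # hidden-to-hidden weights
xs = rng.standard_normal((seq_len, d_in))      # toy input sequence

# Each step needs the previous hidden state -- inherently sequential.
h = np.zeros(d_h)
for x in xs:
    h = np.tanh(W_xh @ x + W_hh @ h)
```

Repeated multiplication by `W_hh` inside this loop is also where vanishing gradients come from during backpropagation.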

4. Sampling from the Learned Distribution?

Once the model has learned the data distribution, it can sample from this distribution to generate new samples. This sampling process introduces randomness, which allows the model to produce varied outputs for the same input[1].
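This randomness is often controlled with a temperature parameter; a small NumPy sketch with made-up logits:

```python
import numpy as np

def sample_next_token(logits, temperature, rng):
    # Softmax with temperature: lower values sharpen the distribution
    # (more deterministic), higher values flatten it (more random).
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

rng = np.random.default_rng(3)
logits = np.array([2.0, 1.0, 0.5, 0.1])  # illustrative scores for 4 tokens
draws = [sample_next_token(logits, temperature=1.0, rng=rng) for _ in range(1000)]
```

Running the same query twice yields different `draws`, which is exactly the varied-output behavior described above.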

🟡 Medium (7)

1. Mathematical Intuition of Attention Block?

Each input is projected into three vectors: **Query (Q)**, **Key (K)**, and **Value (V)**. The attention score is computed as:

`Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) * V`

The dot product of Q and K measures relevance between tokens, scaling by `sqrt(d_k)` prevents large values from dominating softmax, and the result weights the V vectors to produce context-aware outputs. **Multi-head attention** runs multiple independent attention functions in parallel and concatenates their outputs, allowing the model to attend to different representation subspaces simultaneously.
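The formula above can be sketched directly in NumPy (shapes and data are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(4)
n_tokens, d_k = 5, 16
Q = rng.standard_normal((n_tokens, d_k))
K = rng.standard_normal((n_tokens, d_k))
V = rng.standard_normal((n_tokens, d_k))
out, w = attention(Q, K, V)
```

Each row of `w` is one token's attention distribution over all tokens; multi-head attention would run several such functions with separate projections and concatenate the outputs.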

2. Advanced Explanation of FAISS?

FAISS indexes large sets of dense vectors for fast similarity search. Key components:

- **Distance metrics**: L2 (Euclidean) or inner product (cosine similarity when normalized)
- **Flat Index**: Brute-force exact search — accurate but slow for large datasets
- **IVF (Inverted File Index)**: Partitions vectors into clusters; searches only nearest clusters for speed
- **Product Quantization (PQ)**: Compresses vectors into compact codes to reduce memory; approximates distances
- **HNSW**: Graph-based index for fast approximate nearest neighbor search

3. What are vector databases and what problems do they solve?

Vector databases store high-dimensional embedding vectors and are optimized for similarity search (nearest neighbor retrieval), unlike traditional databases which handle structured queries and exact matches.

Key concepts:
- **ANN search**: Approximate nearest neighbor using LSH, HNSW graphs, or IVF to trade some accuracy for speed
- **Dimensionality handling**: Specialized indexes (HNSW, PQ), plus optional dimensionality reduction such as PCA (t-SNE is better suited to visualization than to indexing)
- **Popular systems**: Pinecone (managed), Milvus (open-source, scalable), Weaviate (GraphQL API)

4. Generative AI Fundamentals?

Key fundamentals of Generative AI:

- **Discriminative vs Generative**: Discriminative models learn decision boundaries; generative models learn data distributions to create new samples
- **Latent space**: Lower-dimensional encoding of data features that enables meaningful sample generation
- **Evaluation metrics**: Inception Score (quality/diversity), FID (statistical similarity to real data), human evaluation
- **Mode collapse** (GANs): Mitigated with mini-batch discrimination, spectral normalization, WGAN-GP loss

5. Intermediate Explanation of FAISS?

FAISS searches a large vector dataset (e.g., 1 billion embeddings) efficiently:
1. **Preprocessing**: Builds an index by clustering vectors and encoding them with product quantization to reduce memory
2. **Searching**: For a query vector, identifies the nearest clusters first, then only compares within those clusters
3. **Ranking**: Returns the top-k most similar vectors by score

Optimized with multi-threading and GPU acceleration for fast search even at billion-vector scale.
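The three steps can be sketched in NumPy; here random database points stand in for trained k-means centroids, and `nlist`/`nprobe` are illustrative values:

```python
import numpy as np

rng = np.random.default_rng(5)
db = rng.standard_normal((2000, 32)).astype("float32")
query = rng.standard_normal(32).astype("float32")

# Step 1 (preprocessing): coarse centroids partition the database into cells.
nlist = 20
centroids = db[rng.choice(len(db), nlist, replace=False)]
assign = ((db[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(1)

# Step 2 (searching): probe only the nprobe cells nearest to the query.
nprobe = 3
nearest_cells = ((centroids - query) ** 2).sum(1).argsort()[:nprobe]
candidates = np.where(np.isin(assign, nearest_cells))[0]

# Step 3 (ranking): exact distances only within the candidate set.
d = ((db[candidates] - query) ** 2).sum(1)
topk = candidates[np.argsort(d)[:5]]
```

Only the candidate subset is scanned exactly, which is where the speedup over brute force comes from.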

6. Variational Autoencoders (VAEs)?

Variational Autoencoders (VAEs) learn a latent representation of data and use it to generate new samples. They are trained to maximize a lower bound (the ELBO) on the likelihood of the training data under the learned generative model. Unlike GANs, VAEs provide an explicit probabilistic framework with an encoder (data to latent space) and decoder (latent space to data).
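A NumPy sketch of the core VAE mechanics: the reparameterization trick and the closed-form KL term of the ELBO, with made-up encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(6)
latent_dim = 8

# Hypothetical encoder outputs for one input: mean and log-variance
# of the approximate posterior q(z|x).
mu = rng.standard_normal(latent_dim) * 0.1
log_var = rng.standard_normal(latent_dim) * 0.1

# Reparameterization trick: z = mu + sigma * eps, so gradients can
# flow through mu and log_var even though z is random.
eps = rng.standard_normal(latent_dim)
z = mu + np.exp(0.5 * log_var) * eps

# KL divergence of q(z|x) against a standard normal prior (closed form);
# the ELBO combines this with a reconstruction term from the decoder.
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
```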

7. Adversarial Training (GANs)?

One popular type of generative model is the Generative Adversarial Network (GAN). GANs consist of two neural networks: a generator and a discriminator. The generator generates new samples, while the discriminator tries to distinguish between real and generated samples. Through this adversarial training process, the generator learns to produce more realistic samples that can fool the discriminator[2].
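The two objectives can be sketched with binary cross-entropy on hypothetical discriminator outputs (NumPy, illustrative numbers):

```python
import numpy as np

def bce(pred, target):
    # Binary cross-entropy on discriminator probabilities.
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return -(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean()

# Hypothetical discriminator outputs: probability that a sample is real.
d_real = np.array([0.9, 0.8, 0.95])   # on real samples
d_fake = np.array([0.1, 0.2, 0.05])   # on generated samples

# Discriminator objective: push real -> 1 and fake -> 0.
d_loss = bce(d_real, np.ones(3)) + bce(d_fake, np.zeros(3))
# Generator objective: make the discriminator say fake -> 1 (fool it).
g_loss = bce(d_fake, np.ones(3))
```

With a confident discriminator (as in these numbers), `g_loss` is large, which is the pressure that drives the generator toward more realistic samples.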

🔴 Hard (4)

1. The Transformer Architecture?

Introduced in "Attention Is All You Need" (2017), the Transformer eliminates recurrence in favor of **self-attention**, processing all tokens in parallel.

Key components:
- **Self-attention**: Each token attends to all others, capturing dependencies regardless of distance
- **Positional encoding**: Sinusoidal or learned embeddings inject word-order information
- **Encoder-decoder structure**: Encoder processes input; decoder generates output; both use self-attention + feed-forward layers
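The sinusoidal positional encoding mentioned above can be sketched in NumPy (dimensions are illustrative):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal scheme from "Attention Is All You Need":
    # PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(seq_len=10, d_model=16)
```

These vectors are added to the token embeddings, giving otherwise order-blind self-attention a sense of position.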

2. FAISS and Its Applications?

FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors.

Key points:
- **Index types**: Flat (exact), IVFFlat (approximate with clusters), HNSW (graph-based), PQ (compressed)
- **ANN search**: Limits search to nearby clusters/graph neighbors for speed, trading some accuracy
- **Key parameters**: `nlist` (number of clusters), `nprobe` (clusters searched per query) — tune for speed/accuracy tradeoff
- **GPU support**: FAISS can run on NVIDIA GPUs for significant speedup

3. Transformer Architectures?

Key Transformer architecture concepts:

- **Components**: Encoder/decoder layers, multi-head attention, feed-forward networks, layer norm + residual connections
- **Self-attention**: Weighted sum of values based on query-key compatibility; captures long-range dependencies
- **Positional encoding**: Sinusoidal or learned embeddings to inject sequence order
- **Architecture variants**: Encoder-only (BERT — understanding), Decoder-only (GPT — generation), Encoder-decoder (T5 — seq2seq)
- **Variable-length input**: Handled via padding tokens and attention masks
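The attention-mask mechanism for padding can be sketched in NumPy: masked positions receive `-inf` before the softmax, so they get exactly zero attention weight (shapes and mask are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
seq_len, d_k = 6, 8
Q = rng.standard_normal((seq_len, d_k))
K = rng.standard_normal((seq_len, d_k))

# Suppose the last two positions are padding tokens (mask False = padding).
pad_mask = np.array([1, 1, 1, 1, 0, 0], dtype=bool)

scores = Q @ K.T / np.sqrt(d_k)
# -inf scores become zero after exp, so padding gets no attention.
scores = np.where(pad_mask[None, :], scores, -np.inf)
scores -= scores.max(axis=-1, keepdims=True)
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)
```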

4. Advanced Explanation of FAISS Indexing?

FAISS uses advanced indexing for efficient similarity search:

- **IVF (Inverted File Index)**: Partitions vector space into Voronoi cells; narrows search to nearest cells
- **Product Quantization (PQ)**: Decomposes vectors into subvectors, quantizes each separately for compact RAM storage
- **HNSW graph**: Multi-layer navigable small-world graph for fast traversal to nearest neighbors

In practice, IVF and PQ are often combined (IVF+PQ), trading a small amount of accuracy for large savings in memory and search time; a flat index remains the most accurate. These techniques enable state-of-the-art similarity search for semantic search, recommendations, and content retrieval.
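The PQ encode/decode step can be sketched in NumPy; random codebooks stand in for trained sub-quantizers, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
d, m, k = 32, 4, 16           # vector dim, subvectors, centroids per subspace
sub = d // m                  # each subvector is 8-dimensional
vecs = rng.standard_normal((500, d)).astype("float32")

# Codebooks: k centroids per subspace (random stand-ins for trained k-means).
codebooks = rng.standard_normal((m, k, sub)).astype("float32")

# Encode: each vector becomes m small integer codes instead of d floats.
parts = vecs.reshape(500, m, sub)
codes = np.empty((500, m), dtype=np.uint8)
for j in range(m):
    dists = ((parts[:, j, None, :] - codebooks[j][None]) ** 2).sum(-1)
    codes[:, j] = dists.argmin(1)

# Decode: approximate reconstruction by looking codes back up.
recon = np.concatenate([codebooks[j][codes[:, j]] for j in range(m)], axis=1)
```

Here each 32-float vector compresses to 4 bytes of codes; distances are then approximated against reconstructions like `recon`, which is the memory/accuracy tradeoff PQ makes.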