DL Notes
A focused knowledge base for modern deep learning systems, from hardware fundamentals to LLM serving and neural rendering.
Start Here
- Hardware - GPU/NPU basics, memory systems, and practical GPU configuration
- Tensor Operations - core tensor manipulations, activation functions, and CUDA Graph notes
- AI Infra Metrics - time to first token (TTFT), time per output token (TPOT), and tokens per second (TPS)
- KV Cache - KV memory semantics, paged blocks, and prefix reuse
- Serving Runtime - chunked prefill, admission control, and runtime-side stability
- Parallelism - data parallelism (DP) and tensor parallelism (TP) from the perspective of memory, throughput, and communication
- Decoding and Sampling - sampling policies and speculative decoding
- Training Objective - the autoregressive (next-token prediction) pre-training objective
- Models - model-specific notes (Qwen3-Omni, DFlash) and practical serving commands
- Neural Graphics - NeRF and Flow Matching foundations
Documentation Map
Systems and Infrastructure
AI Infra
Models
Graphics and Generative Modeling
Scope
This site emphasizes practical understanding:
- concise theory with equations where useful
- implementation-minded notes and runnable snippets
- serving and performance considerations for real workloads