DL Notes

A focused knowledge base for modern deep learning systems, from hardware fundamentals to LLM serving and neural rendering.

Start Here

  1. Hardware - GPU/NPU basics, memory systems, and practical GPU configuration
  2. Tensor Operations - core tensor manipulations, activation functions, and CUDA Graph notes
  3. AI Infra Metrics - TTFT (time to first token), TPOT (time per output token), and TPS (tokens per second)
  4. KV Cache - KV memory semantics, paged blocks, and prefix reuse
  5. Serving Runtime - chunked prefill, admission control, and runtime-side stability
  6. Parallelism - data parallelism (DP) and tensor parallelism (TP) from the perspective of memory, throughput, and communication
  7. Decoding and Sampling - sampling policies and speculative decoding
  8. Training Objective - the autoregressive (next-token prediction) pre-training objective
  9. Models - model-specific notes (Qwen3-Omni, DFlash) and practical serving commands
  10. Neural Graphics - NeRF and Flow Matching foundations

Documentation Map

Systems and Infrastructure

AI Infra

Models

Graphics and Generative Modeling

Scope

This site emphasizes practical understanding:

  • concise theory with equations where useful
  • implementation-minded notes and runnable snippets
  • serving and performance considerations for real workloads