Deep Representation Learning

This page tracks resources on representation learning, attention variants, and interpretations of training.

Sources in this batch

  • “Principles and Practice of Deep Representation Learning” is a book-length resource.
  • Raschka’s visual guide covers attention variants in modern LLMs.
  • DiffusionBlocks interprets block-wise neural-network training through a diffusion lens.

Research interest

The common thread is that representation learning is being reinterpreted through several lenses at once: attention variants for scaling, diffusion interpretations of training, and book-length consolidation of principles. DiffusionBlocks is worth a deeper read because it may connect generative-modeling intuitions to optimization and training dynamics.

Open questions:

  • Which attention variants matter in practice versus mainly in theory?
  • Can diffusion interpretations of training yield better algorithms, or do they only offer post-hoc explanations?
  • How do representation-learning principles change in agentic or multimodal settings?