Articles tagged: "muon-optimizer"
- Training a Portrait DiT on a Single GPU: What the Ablation Study Taught Us
The prevailing assumption in generative AI is that training a large, multi-modal Diffusion Transformer from scratch requires a cluster. prx-tg is a direct challenge to that assumption: a 400M+ parameter...