Originally posted by StillStuckOnSI
For training you definitely need at least fp16 (bf16 is much better). Training also tends to want large memory, and for the larger models you want multiple GPUs with fast interconnects. Those are the things that distinguish training-oriented from inference-oriented GPUs.
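One way to see why bf16 is preferred over fp16 for training is dynamic range: bf16 keeps fp32's 8 exponent bits, so gradients and loss values are far less likely to overflow or underflow. A minimal sketch computing the largest finite value each format can represent from its bit layout (the helper name is mine, not from any library):

```python
def max_finite(exp_bits: int, frac_bits: int) -> float:
    """Largest finite value of an IEEE-style float with the given
    exponent and fraction widths (top exponent is reserved for inf/nan)."""
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias
    return (2 - 2.0 ** -frac_bits) * 2.0 ** max_exp

# fp16: 5 exponent bits, 10 fraction bits -> tops out at 65504
# bf16: 8 exponent bits, 7 fraction bits  -> same ~3.4e38 range as fp32
print(max_finite(5, 10))  # 65504.0
print(max_finite(8, 7))
```

With fp16's 65504 ceiling, training usually needs loss scaling to keep gradients in range; bf16 trades mantissa precision for fp32-like range and typically avoids that machinery.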