I think this is a pretty common message for PyTorch users with limited GPU memory:
RuntimeError: CUDA out of memory. Tried to allocate X MiB (GPU X; X GiB total capacity; X GiB already allocated; X MiB free; X cached)
I tried to process an image by moving each layer to the GPU one at a time and then moving it back to the CPU afterwards:
for m in self.children():
    m.cuda()
    x = m(x)
    m.cpu()
    torch.cuda.empty_cache()
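For context, here is a minimal self-contained sketch of how that loop sits inside my forward pass (the conv layers and the input size below are just placeholders, not my actual network):

import torch
import torch.nn as nn

class OffloadNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Placeholder layers; the real network is much larger.
        self.conv1 = nn.Conv2d(3, 64, 3, padding=1)
        self.conv2 = nn.Conv2d(64, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        for m in self.children():
            m.cuda()                  # move only this layer's weights to the GPU
            x = m(x)                  # run it; the activation x stays on the GPU
            m.cpu()                   # move the weights back to host memory
            torch.cuda.empty_cache()  # return cached blocks to the driver
        return x

model = OffloadNet()
x = torch.randn(1, 3, 512, 512).cuda()  # input lives on the GPU
out = model(x)

As far as I can tell, only the weights get offloaded this way; the intermediate activations (and whatever autograd saves for the backward pass) still live on the GPU, which I suspect is why the savings are small.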
But it doesn't seem to be very effective. I'm wondering whether there are any tips and tricks to train large deep learning models while using little GPU memory.