
PyTorch reserving way more memory than needed


I'm trying to fine-tune a SentenceTransformer. The issue is that I'm running into an OOM error (I'm using Google Cloud to train the model).

I keep getting the message that PyTorch has reserved ~13 GB (out of the ~14 GB available), so there's no room left for any batch.

If I calculate the actual memory used by the model, it's only around 1.3 GB:

from sentence_transformers import SentenceTransformer, models
import torch

model_name = "alexandrainst/scandi-nli-large"
word_embedding_model = models.Transformer(model_name, max_seq_length=512)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
model.to("cuda")

# Sum the sizes of all parameters and buffers to estimate the model footprint.
param_size = 0
for param in model.parameters():
    param_size += param.nelement() * param.element_size()
buffer_size = 0
for buffer in model.buffers():
    buffer_size += buffer.nelement() * buffer.element_size()
size_all_mb = (param_size + buffer_size) / 1024 ** 2
print('model size: {:.3f}MB'.format(size_all_mb))  # ~1300

torch.cuda.memory_reserved() / (1024 ** 2)  # ~1300
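For what it's worth, torch.cuda also distinguishes between memory actually allocated to tensors and memory held by the caching allocator, so the gap can be inspected directly (a minimal check, run right after moving the model to the GPU):

import torch

# Memory currently occupied by live tensors.
allocated_mb = torch.cuda.memory_allocated() / (1024 ** 2)

# Memory held by the caching allocator (live tensors plus cached blocks not yet
# returned to the driver); this is the number the OOM message reports.
reserved_mb = torch.cuda.memory_reserved() / (1024 ** 2)

print(f"allocated: {allocated_mb:.1f} MB, reserved: {reserved_mb:.1f} MB")

# Full per-block breakdown of the allocator state.
print(torch.cuda.memory_summary())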

I have tried calling torch.cuda.empty_cache() and setting os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128", but the same error occurs.
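One thing I'm unsure about is whether the setting is applied too late: as far as I understand, PYTORCH_CUDA_ALLOC_CONF is read when the caching allocator is first initialized, so it would have to be set before the first CUDA call. A sketch, assuming that reading is correct:

import os

# Assumption: the allocator reads this variable once, when it is first
# initialized, so setting it after the first CUDA allocation has no effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported only after the variable is set

x = torch.zeros(1, device="cuda")  # first CUDA allocation; allocator now configured
print(torch.cuda.memory_reserved() / (1024 ** 2), "MB reserved")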

Isn't there a way to stop PyTorch from reserving memory (or at least reduce it) and have it use only the memory that is actually needed?

