I'm trying to fine-tune a SentenceTransformer. The issue is that I'm running into an OOM error (I'm training the model on Google Cloud).
I keep getting an error saying that PyTorch has reserved ~13 GB (out of the ~14 GB available), so there's no room left for any batch.
If I calculate the actual memory used, it's around 1.3 GB:
from sentence_transformers import SentenceTransformer, models
import torch

model_name = "alexandrainst/scandi-nli-large"
word_embedding_model = models.Transformer(model_name, max_seq_length=512)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
model.to("cuda")

# Sum up the sizes of all parameters and buffers
param_size = 0
for param in model.parameters():
    param_size += param.nelement() * param.element_size()
buffer_size = 0
for buffer in model.buffers():
    buffer_size += buffer.nelement() * buffer.element_size()
size_all_mb = (param_size + buffer_size) / 1024 ** 2
print('model size: {:.3f}MB'.format(size_all_mb))  # ~1300

print(torch.cuda.memory_reserved() / (1024 ** 2))  # ~1300
I have tried calling torch.cuda.empty_cache() and setting os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128", but the same error occurs.
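For completeness, this is how I'm applying those two workarounds; I'm assuming the environment variable only takes effect if it's set before torch initializes CUDA, so I put it at the very top of the script:

import os

# Assumption: must be set before CUDA is initialized,
# so this goes before any torch import / CUDA call
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Releases cached blocks back to the driver; it does not free
# memory held by tensors that are still referenced
torch.cuda.empty_cache()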
Isn't there a way to stop PyTorch from reserving memory (or at least reduce it), so it only uses the memory that is actually needed?
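In case it helps with diagnosing this, here's the sketch I use to compare what live tensors actually occupy against what the caching allocator has reserved (both are standard torch.cuda calls):

import torch

# Memory occupied by live tensors
allocated_mb = torch.cuda.memory_allocated() / 1024 ** 2
# Memory held by PyTorch's caching allocator (live tensors + cache)
reserved_mb = torch.cuda.memory_reserved() / 1024 ** 2
print(f"allocated: {allocated_mb:.0f} MB, reserved: {reserved_mb:.0f} MB")

# Detailed breakdown from the caching allocator
print(torch.cuda.memory_summary())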