I am trying to follow this article on Medium.
I had a few problems with it, so the main change I made was to the TrainingArguments object: I added gradient_checkpointing_kwargs={'use_reentrant':False}.
So now I have the following objects:
    peft_training_args = TrainingArguments(
        output_dir = output_dir,
        warmup_steps=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        max_steps=100, #1000
        learning_rate=2e-4,
        optim="paged_adamw_8bit",
        logging_steps=25,
        logging_dir="./logs",
        save_strategy="steps",
        save_steps=25,
        evaluation_strategy="steps",
        eval_steps=25,
        do_eval=True,
        gradient_checkpointing=True,
        gradient_checkpointing_kwargs={'use_reentrant':False},
        report_to="none",
        overwrite_output_dir = 'True',
        group_by_length=True,
    )

    peft_model.config.use_cache = False

    peft_trainer = transformers.Trainer(
        model=peft_model,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        args=peft_training_args,
        data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )

And when I call peft_trainer.train() I get the following error:

    AttributeError: 'torch.dtype' object has no attribute 'itemsize'

I'm using Databricks, and my PyTorch version is 2.0.1+cu118.
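For anyone trying to narrow this down, here is a minimal diagnostic sketch. It rests on an assumption that is not stated in the post: that `torch.dtype.itemsize` first shipped in PyTorch 2.1, so an older install would raise exactly this AttributeError when a newer library reads the attribute. The helper names `parse_release` and `likely_has_dtype_itemsize` are hypothetical, and the check only compares version strings, so it runs without torch installed.

```python
import re

# Assumption (not confirmed by the post): dtype.itemsize first appeared in PyTorch 2.1.
ASSUMED_MIN_RELEASE = (2, 1)


def parse_release(version: str) -> tuple:
    """Return the leading numeric part of a version string as a tuple of ints.

    Local build tags such as '+cu118' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def likely_has_dtype_itemsize(version: str) -> bool:
    """Guess whether this PyTorch version exposes torch.dtype.itemsize."""
    return parse_release(version) >= ASSUMED_MIN_RELEASE


# The version reported in the question
print(likely_has_dtype_itemsize("2.0.1+cu118"))  # False
```

On the cluster itself, `hasattr(torch.float32, "itemsize")` gives a direct answer instead of a guess from the version string. If it returns False, the likely mismatch is between the installed PyTorch and a newer library in the training stack that expects the attribute.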