I'm trying to load a model with 8-bit quantization like this:

```python
from transformers import LlamaForCausalLM, BitsAndBytesConfig

model_path = '/model/'
model = LlamaForCausalLM.from_pretrained(
    model_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
```

but I get the error:
```
ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes `pip install -i https://test.pypi.org/simple/ bitsandbytes` or `pip install bitsandbytes`
```

I've installed both packages, but I still get the same error, even after shutting down and restarting the Jupyter kernel I was running this in.
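One thing I checked (a diagnostic sketch, not a fix): whether the Jupyter kernel is actually running the same Python environment where `pip` installed the packages. Running this in a notebook cell shows which interpreter the kernel uses and whether `accelerate` and `bitsandbytes` are importable from it:

```python
# Run inside the Jupyter kernel: print the interpreter path and check
# whether the required packages are visible to this environment.
import sys
import importlib.util

print(sys.executable)  # interpreter the kernel is running on

for pkg in ("accelerate", "bitsandbytes"):
    spec = importlib.util.find_spec(pkg)
    if spec is not None:
        print(f"{pkg}: found at {spec.origin}")
    else:
        print(f"{pkg}: NOT found in this environment")
```

If either package shows as not found, the install went into a different environment than the one the kernel uses, and `{sys.executable} -m pip install accelerate bitsandbytes` would target the right one.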