Channel: Active questions tagged python - Stack Overflow

Run pre-trained LLM model on CPU - ValueError: Expected a cuda device, but got: cpu


I am using an LLM, CohereForAI/c4ai-command-r-plus-4bit, for inference. I have a GPU, but it isn't powerful enough, so I want to run the model on CPU. Below is my code and the error it produces.

Code:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

PRETRAIN_MODEL = 'CohereForAI/c4ai-command-r-plus-4bit'

tokenizer = AutoTokenizer.from_pretrained(PRETRAIN_MODEL)
model = AutoModelForCausalLM.from_pretrained(PRETRAIN_MODEL, device_map='cpu')

text = "this is an example"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    embedding = outputs.last_hidden_state.mean(dim=1).squeeze().numpy()

print(embedding.shape)

Error:

ValueError: Expected a cuda device, but got: CPU

Does this mean the c4ai-command-r-plus-4bit model can only run on a GPU? Is there anything I'm missing to run it on CPU? Thanks!
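As a side note on the embedding step in the code above: causal-LM outputs typically don't expose last_hidden_state directly, so the mean pooling I'm attempting can be sketched with plain tensors (the shapes here are made-up stand-ins for a real model's hidden states, which you would normally get via output_hidden_states=True and outputs.hidden_states[-1]):

```python
import torch

# Toy stand-in for a model's final hidden states:
# (batch=1, seq_len=4, hidden_size=8). Hypothetical sizes for illustration.
hidden = torch.randn(1, 4, 8)

# Mean-pool over the sequence dimension, then drop the batch dimension,
# mirroring the embedding line in the question.
embedding = hidden.mean(dim=1).squeeze()

print(embedding.shape)  # torch.Size([8])
```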




