
LiteLLM and Llama-Index Service Context creation


I am going crazy trying to find a solution to this. I am working with a proxy server for OpenAI models, and I'm using an SSH tunnel to hit the server on my localhost.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000", api_key="sk-xxx")
response = client.chat.completions.create(
    model="gpt_35_turbo",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
)
print(response)

This works perfectly. I need to use this with LlamaIndex, so I have to define my llm and embed_model and pass them to my ServiceContext, like so:

from llama_index import PromptHelper, ServiceContext
from llama_index.embeddings import OpenAIEmbedding
from llama_index.llms import OpenAI
from llama_index.text_splitter import SentenceSplitter

llm = OpenAI(model="text-davinci-003", temperature=0, max_tokens=256)
embed_model = OpenAIEmbedding()
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
prompt_helper = PromptHelper(
    context_window=4096,
    num_output=256,
    chunk_overlap_ratio=0.1,
    chunk_size_limit=None,
)
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    text_splitter=text_splitter,
    prompt_helper=prompt_helper,
)

where I would also be calling my own "ada-02" embedding model through the server. How can I make this work with my setup? I have been totally unable to find an answer to this anywhere, and I've already wasted days trying to fix it.
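The closest I've gotten is the sketch below, which points both LlamaIndex wrappers at the proxy through their api_base and api_key arguments instead of api.openai.com. The model names are my guesses at what my proxy's aliases map to, so treat them as placeholders; I don't know if this is the intended approach:

from llama_index import ServiceContext
from llama_index.embeddings import OpenAIEmbedding
from llama_index.llms import OpenAI

# Assumption: the proxy accepts the standard OpenAI model names;
# if it only knows aliases like "gpt_35_turbo", they would need mapping
# on the proxy side.
llm = OpenAI(
    model="gpt-3.5-turbo",
    api_base="http://localhost:8000",
    api_key="sk-xxx",
    temperature=0,
    max_tokens=256,
)

# Same idea for embeddings, routed through the same tunnel.
embed_model = OpenAIEmbedding(
    model="text-embedding-ada-002",
    api_base="http://localhost:8000",
    api_key="sk-xxx",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

Would this actually route every completion and embedding call through http://localhost:8000, or does LiteLLM require its own wrapper here?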

