Channel: Active questions tagged python - Stack Overflow

GAE is very slow loading a sentence transformer


I'm using Google App Engine to host a website using Python and Flask.

I need to add text similarity functionality, using sentence_transformers. In requirements.txt, I add a dependency to the cpu version of torch:

torch @ https://download.pytorch.org/whl/cpu/torch-2.2.1%2Bcpu-cp311-cp311-linux_x86_64.whl
sentence-transformers==2.4.0

When I add these statements to the main.py file:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')

the GAE instance creation time degrades from < 1 sec to > 20 sec.
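Part of the cost is paid at import time: the model is constructed when main.py is first loaded, so every new instance blocks on it before serving any request. A common mitigation (a sketch, not from the question) is to defer construction until first use; the stub below stands in for the real SentenceTransformer call so the pattern runs without the library.

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_model():
    # In the real app this body would be:
    #   from sentence_transformers import SentenceTransformer
    #   return SentenceTransformer('all-MiniLM-L6-v2')
    # A stub object stands in here so the sketch is self-contained.
    # lru_cache(maxsize=1) guarantees the loader runs only once per
    # process; later calls return the cached instance immediately.
    return object()
```

With this pattern the instance starts fast, and only the first request that needs text similarity pays the model-load cost.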

Performance improves if I save the model to a directory in the project and use:

model = SentenceTransformer('./idp_web_server/model')

but it is still over 15 sec (removing the model-creation statement reduces instance creation time to 4 sec). Moving from an F4 instance (2.4 GHz, automatic scaling) to a B8 instance (4.8 GHz, basic scaling) does not improve performance, so it seems to be I/O bound. Running the app locally on my machine (2.4 GHz), model creation takes only 1.7 sec, i.e., it is 5 to 10 times faster.

Can this be improved? Should I move to another Google Cloud service (such as Compute Engine) instead of GAE?

