Is there a method for converting Hugging Face Transformer embeddings back to text?
Suppose that I have text embeddings created with Hugging Face's CLIPTextModel as follows:

    import torch
    from transformers import CLIPTokenizer, CLIPTextModel

    class_list = [
        "i love going home and playing with my wife and kids",
        "i love going home",
        "playing with my wife and kids",
        "family",
        "war",
        "writing",
    ]

    model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

    inputs = tokenizer(class_list, padding=True, return_tensors="pt")
    outputs = model(**inputs)
    hidden_state = outputs.last_hidden_state
    embeddings = outputs.pooler_output

My embeddings are in the variable "embeddings". Questions:
- Is it possible for me to convert my embeddings back to the input strings in "class_list"? To be precise: if I sent the embeddings to a person who had no foreknowledge of the original strings, what steps would they need to take to recover the list of original strings?
- If so, how can I do this?