I'm trying to run Top2Vec on Colab. The following code works fine with the dataset available at https://raw.githubusercontent.com/wjbmattingly/bap_sent_embedding/main/data/vol7.json.
But when I use the dataset "abcnews-date-text.csv", the Colab cell keeps running and never finishes. Any idea how to resolve this, please?
from top2vec import Top2Vec
import pandas as pd

# Load the dataset
data = pd.read_csv("abcnews-date-text.csv")

# Extract the text data from the dataset
documents = data['headline_text'].tolist()

# Initialize Top2Vec model
top2vec_model = Top2Vec(documents, embedding_model="distiluse-base-multilingual-cased")

# Get the number of topics
num_topics = 5  # You can adjust this number according to your preference

# Get the top topics (get_topics returns topic words, word scores, and topic numbers)
topic_words, word_scores, topic_nums = top2vec_model.get_topics(num_topics)

# Print the top topics
for i, words in enumerate(topic_words):
    print(f"Topic {i+1}: {', '.join(words)}")
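
One thing that may be worth ruling out is dataset size: if "abcnews-date-text.csv" is the Kaggle "A Million News Headlines" file, it has over a million rows, and embedding that many documents with a transformer model could simply take a very long time on Colab. The sketch below is only a way to test that guess by training on a random subset first. The helper name `subsample_documents` and the `max_docs` value are hypothetical, not part of Top2Vec, and the stand-in list replaces `data['headline_text'].tolist()` so the snippet runs on its own.

```python
import random

def subsample_documents(documents, max_docs, seed=0):
    """Return at most max_docs documents, chosen at random without replacement."""
    # Keep only non-empty strings so the embedding step sees real text
    # (assumption: missing headlines show up as NaN or empty strings).
    cleaned = [d for d in documents if isinstance(d, str) and d.strip()]
    if len(cleaned) <= max_docs:
        return cleaned
    rng = random.Random(seed)  # fixed seed so repeated runs pick the same sample
    return rng.sample(cleaned, max_docs)

# Stand-in data; in the notebook this would be data['headline_text'].tolist()
documents = [f"headline number {i}" for i in range(1000)]
small = subsample_documents(documents, max_docs=100)
print(len(small))
```

If a run on the subset finishes in reasonable time, the subset can be passed to `Top2Vec(...)` in place of `documents` and grown step by step to see where the runtime becomes impractical. Switching the Colab runtime to a GPU may also help, though that is untested here.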