I'm working on an NLP model (using LSTMs) that takes in a text sentence (e.g. a Google Maps review) and predicts its star rating.
I got hold of the Yelp dataset, which has over six million reviews, and I'm training my model on it. The dataset is too large to process at once, so I train the model in batches of 10K reviews.
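For context, this is roughly how I feed the data in 10K-review batches. It is only a minimal sketch: `iter_chunks` and the `(text, stars)` tuple layout are names I made up for illustration, and the actual training call is left as a comment.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Assumed layout for illustration: (review text, star rating 1-5).
Review = Tuple[str, int]


def iter_chunks(reviews: Iterable[Review],
                chunk_size: int = 10_000) -> Iterator[List[Review]]:
    """Yield consecutive lists of at most `chunk_size` reviews.

    Works on any iterable (e.g. a generator reading the file line by line),
    so the full dataset never has to be held in memory at once.
    """
    it = iter(reviews)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Usage sketch (the training step is hypothetical):
# for chunk in iter_chunks(review_stream):
#     train_on(chunk)  # one pass of LSTM training on this 10K batch
```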
Let's say that I finish training and achieve reasonable performance. Then Yelp releases 1 million new reviews. How do I go about training the model on the new data?
1- Should I train the model on the new data alone?
Or
2- Should I combine the new data with the old data and retrain the model?
3- How do I avoid catastrophic forgetting?
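To make the options concrete, here is a sketch of how I imagine the update batches would be built. `build_update_batch` and `replay_fraction` are hypothetical names of my own: a fraction of 0 corresponds to option 1 (new data only), while a larger fraction mixes a random sample of old reviews into each new batch, which is one way of moving toward option 2. I don't know whether this is the right approach, hence the question.

```python
import random
from typing import List, Sequence, Tuple

# Assumed layout for illustration: (review text, star rating 1-5).
Review = Tuple[str, int]


def build_update_batch(new_chunk: Sequence[Review],
                       old_reviews: Sequence[Review],
                       replay_fraction: float,
                       rng: random.Random) -> List[Review]:
    """Combine a chunk of new reviews with a random sample of old ones.

    `replay_fraction` is the share of the final batch made up of old reviews:
    0.0 means new data only; 0.5 means half old, half new.
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    # Number of old reviews needed so they form `replay_fraction` of the batch.
    n_old = round(len(new_chunk) * replay_fraction / (1.0 - replay_fraction))
    n_old = min(n_old, len(old_reviews))  # cannot sample more than we have
    batch = list(new_chunk) + rng.sample(list(old_reviews), n_old)
    rng.shuffle(batch)  # avoid presenting all new reviews before the old ones
    return batch
```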