
Sampling and feature selection on the RCV1 dataset

I am interested in applying classification algorithms such as KNN, GP, MLP, etc. to the RCV1 dataset for topic classification. However, this dataset is quite large: the data has shape (804414, 47236) and the target has shape (804414, 103). The data is also very sparse, with most entries equal to zero. Each time I try to train a model, I either get a memory error or run into problems with outliers and unrelated data. I am using Python in Google Colab. To make these algorithms easier to run, I am considering methods such as sampling or feature selection. I would appreciate guidance on how to do this and on which techniques are efficient.

Thank you!

In short: what are the best techniques for reducing the dimensionality of RCV1?
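To make the question concrete, here is a minimal sketch of the kind of pipeline being asked about: random row sampling, then chi-squared feature selection, then truncated SVD, all done on a sparse matrix so it never has to be densified. The function name `reduce_sparse`, its parameter values, and the synthetic data are assumptions made for illustration only; they are not tuned for RCV1. On the real data, `X` and the label matrix would presumably come from `sklearn.datasets.fetch_rcv1()`, with a single topic column taken as `y`, since `chi2` as used here expects one label per row and non-negative features.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_selection import SelectKBest, chi2


def reduce_sparse(X, y, n_samples=2000, k_features=500, n_components=50, seed=0):
    """Subsample rows, keep the k best features, then project with SVD.

    X is a non-negative scipy sparse matrix (CSR), y a 1-D label array.
    All default sizes are illustrative assumptions, not recommendations.
    """
    rng = np.random.default_rng(seed)

    # 1) Row sampling: draw a random subset of documents without replacement.
    n_samples = min(n_samples, X.shape[0])
    rows = np.sort(rng.choice(X.shape[0], size=n_samples, replace=False))
    X_sub, y_sub = X[rows], y[rows]

    # 2) Feature selection: chi2 works directly on sparse, non-negative input.
    selector = SelectKBest(chi2, k=min(k_features, X_sub.shape[1]))
    X_sel = selector.fit_transform(X_sub, y_sub)

    # 3) Dimensionality reduction: TruncatedSVD accepts sparse input and
    #    returns a small dense array that KNN, GP or MLP models can handle.
    n_comp = min(n_components, X_sel.shape[1] - 1)
    svd = TruncatedSVD(n_components=n_comp, random_state=seed)
    X_red = svd.fit_transform(X_sel)
    return X_red, y_sub


# Synthetic stand-in for RCV1 (assumption: real data would be loaded with
# sklearn.datasets.fetch_rcv1 and y taken as one column of the targets).
X_demo = sparse.random(5000, 2000, density=0.01, format="csr", random_state=0)
y_demo = np.random.default_rng(0).integers(0, 2, size=5000)

X_red, y_sub = reduce_sparse(X_demo, y_demo)
print(X_red.shape, y_sub.shape)
```

The order of the steps is a design choice: sampling first keeps the later fits cheap, and feature selection before SVD shrinks the matrix that SVD has to factorize. Whether plain random sampling is appropriate for rare topics (as opposed to stratified sampling) is an open point of the question.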

