How to do hyper-parameter tuning with panel data in sklearn framework?

Imagine we have multiple time-series observations for multiple entities, and we want to perform hyper-parameter tuning on a single model, splitting the data in a time-series cross-validation fashion.

To my knowledge, scikit-learn offers no straightforward way to do this kind of hyper-parameter tuning. TimeSeriesSplit handles a single time series, but it splits purely on row position, so it does not work when several entities share the same periods.
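
To illustrate the problem, here is a minimal sketch, assuming a toy panel of two countries and ten periods sorted by country and then by period. The names `panel` and `out_of_order` are only for illustration. It checks, fold by fold, whether the training rows contain a period at or after the earliest test period:

```python
from itertools import product

import pandas as pd
from sklearn.model_selection import TimeSeriesSplit

# Toy panel (an assumption for illustration): 2 entities x 10 periods,
# sorted by entity first and period second.
panel = pd.DataFrame(
    list(product(["ESP", "FRA"], range(10))),
    columns=["country", "period"],
)

tscv = TimeSeriesSplit(n_splits=3)
out_of_order = []
for train_idx, test_idx in tscv.split(panel):
    train_periods = panel["period"].iloc[train_idx]
    test_periods = panel["period"].iloc[test_idx]
    # True when the training set holds a period that is not strictly
    # earlier than every test period, i.e. the split is not temporal.
    violated = bool(train_periods.max() >= test_periods.min())
    out_of_order.append(violated)
    print(
        f"train countries={sorted(panel['country'].iloc[train_idx].unique())}, "
        f"test countries={sorted(panel['country'].iloc[test_idx].unique())}, "
        f"train on later periods than test: {violated}"
    )
```

Because TimeSeriesSplit only looks at row order, some folds train on one country's later periods and test on the other country's earlier ones. Sorting by period instead would not fully help either, since fold boundaries could then fall in the middle of a period.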

As a simple example, imagine we have a dataframe:

    import numpy as np
    import pandas as pd
    from itertools import product

    # create a dataframe with one row per (country, period) pair
    countries = ['ESP', 'FRA']
    periods = list(range(10))
    df = pd.DataFrame(list(product(countries, periods)), columns=['country', 'period'])
    df['target'] = np.concatenate((np.repeat(1, 10), np.repeat(0, 10)))
    df['a_feature'] = np.random.randn(20)

This produces the following dataframe:

    country,period,target,a_feature
    ESP,0,1,0.08
    ESP,1,1,-2.0
    ESP,2,1,0.1
    ESP,3,1,-0.59
    ESP,4,1,-0.83
    ESP,5,1,0.05
    ESP,6,1,0.05
    ESP,7,1,0.42
    ESP,8,1,0.04
    ESP,9,1,2.17
    FRA,0,0,-0.44
    FRA,1,0,-0.48
    FRA,2,0,0.82
    FRA,3,0,-1.64
    FRA,4,0,0.19
    FRA,5,0,0.6
    FRA,6,0,-0.73
    FRA,7,0,-0.5
    FRA,8,0,1.11
    FRA,9,0,-0.75

We want to train a single model across Spain and France: take all the data up to a certain period, then use that trained model to predict the next period for both countries. We then want to assess which set of hyper-parameters performs best.
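
One possible way to express this scheme, offered as a sketch rather than an established solution: scikit-learn's search classes accept an iterable of (train, test) index pairs through the `cv` argument, so the folds can be built by hand from the period column. The helper `panel_time_splits` below is hypothetical (it is not part of scikit-learn), and the estimator and the grid over `C` are placeholders chosen only to make the example runnable.

```python
from itertools import product

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


def panel_time_splits(periods, n_splits=3):
    """Hypothetical helper: yield (train_idx, test_idx) positional indices.

    Each of the last `n_splits` periods serves once as the test period,
    and all rows from strictly earlier periods (for every entity) form
    the matching training set.
    """
    periods = np.asarray(periods)
    unique_periods = np.unique(periods)  # np.unique returns sorted values
    if n_splits >= len(unique_periods):
        raise ValueError("n_splits must be smaller than the number of periods")
    for test_period in unique_periods[-n_splits:]:
        train_idx = np.flatnonzero(periods < test_period)
        test_idx = np.flatnonzero(periods == test_period)
        yield train_idx, test_idx


# Same layout as the example dataframe; the seed is fixed for repeatability.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    list(product(["ESP", "FRA"], range(10))), columns=["country", "period"]
)
df["target"] = np.concatenate((np.repeat(1, 10), np.repeat(0, 10)))
df["a_feature"] = rng.standard_normal(20)

# Materialise the folds as a list so they can be inspected and reused.
splits = list(panel_time_splits(df["period"], n_splits=3))

search = GridSearchCV(
    LogisticRegression(),                # placeholder estimator
    param_grid={"C": [0.1, 1.0, 10.0]},  # placeholder grid
    cv=splits,
)
search.fit(df[["a_feature"]], df["target"])
print(search.best_params_)
```

The indices are positional, so this assumes the rows passed to `fit` are in the same order as the `period` column used to build the folds. Because the folds come from period values rather than row order, the dataframe does not need to be sorted in any particular way.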

How can I do hyper-parameter tuning with panel data in a time-series cross-validation framework?

Similar questions have been asked here:

