Channel: Active questions tagged python - Stack Overflow

Using joblib to dump a scikit-learn model on x86 and then read it on z/OS works for a Decision Tree but fails for a GradientBoostingRegressor


I have made some small tweaks to NumpyArrayWrapper in numpy_pickle.py to allow a decision tree model to be loaded successfully by scikit-learn running on z/OS. The change boils down to checking whether the byte order is correct and, if not, calling array.byteswap(). However, when I try to load a GradientBoostingRegressor model, it fails before even reaching the byteswap fix.
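For illustration, the kind of byte-order check described above might look roughly like the sketch below. It is not the actual patch to NumpyArrayWrapper; the helper name to_native_byteorder is hypothetical, and it only assumes the NumPy calls dtype.isnative, ndarray.byteswap() and dtype.newbyteorder().

```python
import numpy as np


def to_native_byteorder(arr):
    """Return arr in native byte order (hypothetical helper, not joblib code)."""
    if arr.dtype.isnative:
        return arr  # already in this machine's byte order
    # Swap the raw bytes, then relabel the dtype as native so the values are unchanged.
    return arr.byteswap().view(arr.dtype.newbyteorder("="))


# Build an array whose byte order is the opposite of this machine's,
# which mimics data written on a platform of the other endianness.
foreign = np.arange(3, dtype=np.int32).astype(np.dtype(np.int32).newbyteorder("S"))
fixed = to_native_byteorder(foreign)
```

Calling byteswap() on its own changes the bytes but leaves the dtype labelled with the old byte order, which is why the sketch also views the result with a native dtype.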

The error comes from this line, https://github.com/scikit-learn/scikit-learn/blob/0.18.1/sklearn/tree/_tree.pyx#L644, because the condition node_ndarray.dtype != NODE_DTYPE is true. This happens because the GradientBoostingRegressor hits the following line of code while the Decision Tree does not: https://github.com/scikit-learn/scikit-learn/blob/0.18.1/sklearn/externals/joblib/numpy_pickle.py#L105
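One way such a dtype comparison can fail is a byte-order difference alone. The sketch below assumes that is what is happening here, which the error itself does not confirm, and it uses a hypothetical two-field stand-in for NODE_DTYPE rather than the real definition in _tree.pyx.

```python
import numpy as np

# Hypothetical two-field stand-in for NODE_DTYPE; the real dtype has more fields.
native_dtype = np.dtype([("feature", np.int64), ("threshold", np.float64)])
# Same fields with the opposite byte order, as if written on the other platform.
foreign_dtype = native_dtype.newbyteorder("S")

nodes = np.zeros(2, dtype=foreign_dtype)
# Same kind of comparison as the check in _tree.pyx: the field names and sizes
# match, but the dtypes still compare as different because of the byte order.
mismatch = nodes.dtype != native_dtype
```

If this is the cause, an array that reaches the check without having been converted to native byte order would be rejected even though its contents are otherwise valid.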

Does anyone know whether there is something I should be doing differently? The dtypes for the DT model seem to be fine when loading on z/OS, but they are not for the GBR model. The problem appears to come from the model.fit call, because when I remove that call I can successfully load the pkl file on z/OS.

Code for training the Gradient Boosting model

    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.pipeline import Pipeline
    from sklearn import datasets
    from sklearn import metrics
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
    gbr = GradientBoostingRegressor(max_depth=3)
    model = Pipeline([('Gbr', gbr)])
    model.fit(X_train, y_train)

    from sklearn.externals import joblib
    joblib.dump(model, 'GBTmodelx86.pkl')

Code for training the Decision Tree model

    from sklearn import datasets
    from sklearn import metrics
    from sklearn.tree import DecisionTreeClassifier

    dataset = datasets.load_iris()
    model = DecisionTreeClassifier()
    model.fit(dataset.data, dataset.target)

    from sklearn.externals import joblib
    joblib.dump(model, 'DTmodelX86.pkl')

Code for loading each model

    from sklearn.externals import joblib
    model = joblib.load('DTmodelX86.pkl')

    from sklearn.externals import joblib
    model = joblib.load('GBTmodelx86.pkl')
