Channel: Active questions tagged python - Stack Overflow

Is it OK if the actual steps per epoch in keras model.fit differs from the steps_per_epoch value reported by the fit method?


I'm using a Data Generator that feeds a different part of the training set in each epoch. Each part has a different number of observations, so the number of steps per epoch changes every time on_epoch_end is called. However, the value stored for steps_per_epoch stays the same for all epochs (the value computed for the first epoch), and the actual number of steps differs from it:

Epoch 1/3
5/5 [==============================] - 18s 4s/step - loss: 265.6053 - auc: 0.6452 - val_loss: 0.6685 - val_auc: 0.8281
Epoch 2/3
5/5 [==============================] - 32s 7s/step - loss: 0.7178 - auc: 0.7595 - val_loss: 0.6427 - val_auc: 0.8443
Epoch 3/3
5/5 [==============================] - 15s 3s/step - loss: 0.6770 - auc: 0.6119 - val_loss: 0.6347 - val_auc: 0.8369

The 5/5 you see in this output is the final count, but while the fit method is running the step counter sometimes climbs above 5 and sometimes stops below it, e.g. 8/5 or 4/5.

Is this a problem I should worry about? Or can I disregard the reported steps_per_epoch value and assume the fit method is doing what it should?
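For context, the mismatch itself can be reproduced without Keras at all. Below is a minimal stand-in for a Sequence whose length changes on each on_epoch_end, as in my generator; the sizes are made-up numbers chosen only so the arithmetic mirrors the 8/5 and 4/5 counters I see. Any consumer that caches the first len() will then report a stale denominator:

```python
class CyclingSequence:
    """Minimal stand-in for a keras.utils.Sequence that cycles through
    parts of different sizes, changing its length each epoch."""

    def __init__(self, sizes, batch_size):
        self.sizes = sizes            # samples in each of the parts
        self.batch_size = batch_size
        self.j = -1
        self.on_epoch_end()           # load the first part

    def on_epoch_end(self):
        # cycle to the next part, like the generator in the question
        self.j = (self.j + 1) % len(self.sizes)
        self.len = self.sizes[self.j]

    def __len__(self):
        # batches in the *current* part (ceil division)
        return -(-self.len // self.batch_size)


seq = CyclingSequence(sizes=[160, 250, 120, 90], batch_size=32)
reported = len(seq)                   # a consumer caching this sees 5
actual = []
for epoch in range(4):
    actual.append(len(seq))           # real batch count this epoch
    seq.on_epoch_end()

print(reported, actual)               # 5 [5, 8, 4, 3] -> counters like 8/5, 4/5
```

So a fixed "/5" denominator with a moving numerator is exactly what a once-cached length produces; the per-batch work is still driven by the current length.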

This is my data generator, where the data is loaded from 4 different numpy files, a different file at each epoch:

import numpy as np
import keras


class DataGenerator(keras.utils.Sequence):
    'Generates data for Keras'

    def __init__(self, path_to_labels, path_to_data, batch_size=32, n_classes=2, shuffle=True):
        '''Initialization'''
        self.n_channels = 5
        # path_to_data is a list of 4 paths
        self.path_to_data = path_to_data
        self.j = -1
        self.labels = {}
        self.labels[0] = np.load(path_to_labels[0])
        self.labels[1] = np.load(path_to_labels[1])
        self.labels[2] = np.load(path_to_labels[2])
        self.labels[3] = np.load(path_to_labels[3])
        self.batch_size = batch_size
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(self.len / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        # Generate data
        X, y = self.__data_generation(indexes)
        return X, y

    def get_dim(self):
        'Dimensions for the input layer.'
        return (self.dim[0], self.dim[1], self.n_channels)

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.j = self.j + 1
        if self.j == 4:
            self.j = 0
        self.data = np.load(self.path_to_data[self.j])
        self.len = self.data.shape[0]
        self.dim = (self.data.shape[1], self.data.shape[2])
        self.indexes = np.arange(self.len)
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, indexes):
        'Generates data containing batch_size samples'  # X : (n_samples, *dim, n_channels)
        # Initialization
        true_size = len(indexes)
        X = np.empty((true_size, *self.dim, self.n_channels))
        y = np.empty((true_size), dtype=float)
        # Generate data
        for i, idx in enumerate(indexes):
            X[i, :, :, :] = self.data[idx, :, :, :]
            # Store solution (labels for the currently loaded part)
            y[i] = self.labels[self.j][idx]
        return X, keras.utils.to_categorical(y, num_classes=self.n_classes)

I haven't found a way to test whether this is working as expected yet. I fit the model and everything seems fine, except for the per-epoch step counter, which can end on a value lower or higher than the steps_per_epoch value stored by Keras.
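One way to sanity-check a generator like this without running fit at all is to drive it by hand: for each simulated epoch, record len(), pull every batch, then call on_epoch_end(). The harness below only assumes the Sequence protocol (__len__, __getitem__, on_epoch_end), so it should work on the DataGenerator above; the DummySequence and its sizes are hypothetical stand-ins so the sketch runs on its own without Keras or the real .npy files:

```python
import numpy as np


def simulate_epochs(gen, n_epochs):
    """Drive a Sequence-like generator by hand and return, per epoch,
    (batch count reported by len(), total samples actually yielded)."""
    history = []
    for _ in range(n_epochs):
        n_batches = len(gen)                      # what this epoch really has
        n_samples = sum(gen[i][0].shape[0] for i in range(n_batches))
        history.append((n_batches, n_samples))
        gen.on_epoch_end()                        # advance to the next part
    return history


class DummySequence:
    """Stand-in with the same cycling behaviour as the DataGenerator
    in the question, using in-memory zero arrays instead of .npy files."""

    def __init__(self, sizes, batch_size=32):
        self.sizes, self.batch_size, self.j = sizes, batch_size, -1
        self.on_epoch_end()

    def on_epoch_end(self):
        self.j = (self.j + 1) % len(self.sizes)
        self.data = np.zeros((self.sizes[self.j], 4, 4, 5))

    def __len__(self):
        return int(np.ceil(self.data.shape[0] / self.batch_size))

    def __getitem__(self, index):
        batch = self.data[index * self.batch_size:(index + 1) * self.batch_size]
        return batch, None


print(simulate_epochs(DummySequence([160, 250, 120, 90]), 4))
# -> [(5, 160), (8, 250), (4, 120), (3, 90)]
```

If the sample totals match the sizes of the four files, every observation is being consumed each epoch regardless of what the progress-bar denominator says.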
