Channel: Active questions tagged python - Stack Overflow

Error When Fitting Model in TensorFlow: logits and labels must have the same first dimension


I am new to using TensorFlow, but I am trying to build an LSTM model that takes tokenized strings as inputs.

import tensorflow as tf
from transformers import TFBertModel

# Define the model
def build_lstm_model():
    input_ids = tf.keras.layers.Input(shape=(128,), dtype=tf.int32, name='input_ids')

    # BERT embedding layer
    bert_model = TFBertModel.from_pretrained('bert-base-uncased')
    bert_output = bert_model(input_ids)[1]  # using the pooled output

    # LSTM layer
    lstm_output = tf.keras.layers.LSTM(64)(tf.expand_dims(bert_output, axis=1))  # Expand the dimensions for LSTM

    # Output layer
    output = tf.keras.layers.Dense(1, activation='softmax')(lstm_output)

    model = tf.keras.Model(inputs=input_ids, outputs=output)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Train the model
model = build_lstm_model()
#model.summary()
model.fit(train_dataset, validation_data=val_dataset, epochs=3)

Trying to fit this model returns the error:

logits and labels must have the same first dimension, got logits shape [8,1] and labels shape [1024]

More notes: 'train_dataset' and 'val_dataset' are both TensorFlow datasets with shape (None, 128).
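For context, the numbers in the error are consistent with the shapes above: a Dense(1) head emits logits of shape [batch, 1], and a per-example label vector of length 128 flattens to batch * 128 labels, so a batch size of 8 (an assumption read off the error message) yields exactly [8, 1] vs [1024]. sparse_categorical_crossentropy instead expects one integer class id per example, i.e. labels of shape [batch]. The arithmetic as a quick sanity check:

```python
# Shapes implied by the error message, assuming the dataset is batched at 8
# (the batch size is inferred from the logits shape [8, 1], not stated in the post).
batch_size = 8
seq_len = 128          # tokens per example, matching Input(shape=(128,))
num_output_units = 1   # the Dense(1) head

logits_shape = (batch_size, num_output_units)   # -> (8, 1), as reported
labels_shape = (batch_size * seq_len,)          # -> (1024,), as reported

# What sparse_categorical_crossentropy actually wants:
expected_labels_shape = (batch_size,)           # one integer class id per example

print(logits_shape, labels_shape, expected_labels_shape)
```

In other words, the labels coming out of the dataset are token-shaped, (None, 128), when the loss needs example-shaped labels, (None,).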

Here is the output of model.summary():

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_ids (InputLayer)      [(None, 128)]             0

 tf_bert_model (TFBertModel) TFBaseModelOutputWithPoo  109482240
                             lingAndCrossAttentions(
                             last_hidden_state=(None,
                             128, 768), pooler_output
                             =(None, 768), past_key_v
                             alues=None, hidden_state
                             s=None, attentions=None,
                             cross_attentions=None)

 tf.expand_dims (TFOpLambda) (None, 1, 768)            0

 lstm (LSTM)                 (None, 64)                213248

 dense (Dense)               (None, 1)                 65
=================================================================
Total params: 109695553 (418.46 MB)
Trainable params: 109695553 (418.46 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

I can't figure out how to get the dimensions to match so I can start training my model.

I tried the following solutions after reading other Stack Overflow posts:

  • Flattening the LSTM layer:
    lstm_output = tf.keras.layers.Flatten()(lstm_output)
  • Changing the output shape of the LSTM layer and Dense layer.
  • Changing the loss function to 'categorical_crossentropy'.
  • Changing the output layer activation to 'sigmoid'.
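None of those changes by themselves resolve the two underlying mismatches: the dataset needs to yield one integer label per example (shape (None,), not (None, 128)) for sparse_categorical_crossentropy, and a Dense(1, activation='softmax') head always outputs 1.0 regardless of input, so the head should be Dense(num_classes, 'softmax') with sparse labels (or Dense(1, 'sigmoid') with binary_crossentropy for a binary task). A minimal sketch of a consistent head/loss/label pairing, using a plain Embedding layer as a stand-in for BERT (the class count and example data here are placeholders, not from the original post):

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 2      # placeholder: set to the real number of label classes
VOCAB_SIZE = 30522   # bert-base-uncased vocabulary size

# Stand-in for the BERT encoder: Embedding + LSTM over 128-token sequences.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),
    tf.keras.layers.LSTM(64),
    # One probability per class; pairs with the sparse integer labels below.
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Inputs: (num_examples, 128) token ids.
# Labels: one integer class id per example -> shape (num_examples,).
x = np.random.randint(0, VOCAB_SIZE, size=(16, 128))
y = np.random.randint(0, NUM_CLASSES, size=(16,))
model.fit(x, y, batch_size=8, epochs=1, verbose=0)
```

With labels shaped (batch,) and logits shaped (batch, NUM_CLASSES), the first dimensions agree and the loss computes cleanly.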
