
Multi-class, Multi-label Object Detection using TensorFlow Keras in Google Colab


My goal is the following: I want to train an object detection model that can detect multiple classes within an image. Each class can appear a varying number of times, and different classes can appear within the same image.

This is the model I am currently using:

from tensorflow.keras import layers, Model


def create_model(inShape, num_classes, optimizer):
    input_layer = layers.Input(shape=inShape)

    # Convolutional feature extractor
    x = layers.Conv2D(32, (3, 3), activation='relu')(input_layer)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(128, (3, 3), activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Flatten()(x)

    # Two output heads: class probabilities and a single bounding box.
    # The class head uses 4 units as a workaround (see below); num_classes is currently unused.
    class_probs_layer = layers.Dense(4, activation='softmax', name='class_Probs')(x)
    bbox_layer = layers.Dense(4, activation='linear', name='bboxes')(x)

    model = Model(inputs=input_layer, outputs=[class_probs_layer, bbox_layer])

    # Compile the model
    model.compile(
        optimizer=optimizer,
        loss={'class_Probs': 'categorical_crossentropy', 'bboxes': 'mean_squared_error'},
        metrics={'class_Probs': 'accuracy', 'bboxes': 'mean_squared_error'},
    )
    return model

The output of the model is one bounding box (4 coordinates) and 2 probabilities, since right now I am training on a dataset with only 2 classes (caries, black stain). P.S. The class_probs_layer has 4 outputs only as a workaround for an error; otherwise logits and labels don't match.
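To make the limitation concrete: a head with one box and one class distribution can describe only a single instance per image. One possible way to express a varying number of instances with fixed-size outputs is to pad the targets up to a cap and carry a validity mask. The following is only a minimal sketch under assumptions: `MAX_BOXES`, `pad_targets`, and the category ids 0 (caries) and 1 (black stain) are hypothetical names and values chosen for illustration, not part of the current pipeline.

```python
MAX_BOXES = 3    # assumed upper bound on instances per image (hypothetical)
NUM_CLASSES = 2  # caries, black stain


def pad_targets(bboxes, category_ids, max_boxes=MAX_BOXES, num_classes=NUM_CLASSES):
    """Pad a variable number of boxes to fixed-size targets plus a validity mask."""
    if len(bboxes) > max_boxes:
        raise ValueError("more boxes than max_boxes")
    padded_boxes, padded_classes, mask = [], [], []
    for slot in range(max_boxes):
        if slot < len(bboxes):
            # Real instance: copy the box and one-hot encode its class id.
            one_hot = [0.0] * num_classes
            one_hot[category_ids[slot]] = 1.0
            padded_boxes.append(list(bboxes[slot]))
            padded_classes.append(one_hot)
            mask.append(1.0)
        else:
            # Empty slot: all zeros, flagged as invalid by the mask.
            padded_boxes.append([0.0] * 4)
            padded_classes.append([0.0] * num_classes)
            mask.append(0.0)
    return padded_boxes, padded_classes, mask


# One image with a single "black stain" box (assumed id 1).
boxes, classes, mask = pad_targets([[0.1, 0.2, 0.3, 0.4]], [1])
print(mask)
```

With targets shaped like this, the heads would need `MAX_BOXES * 4` box outputs and `MAX_BOXES * NUM_CLASSES` class outputs, and the loss would have to ignore slots where the mask is 0. Whether this works well for this dataset is untested.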

Please suggest a different model structure that fits this goal.

Also, my training is based on TFRecord files, which I write myself. I have one question regarding those: since there are multiple instances within each image (but not the same number every time), there are two possible ways to write the TFRecords:

image data, annotation, image data, annotation... -> one record for each bounding box, so multiple records per image

OR

image data, annotation, annotation, annotation, image data... -> one record per image, where the image data (image_id, filename and image_encoded) appears only once and is followed by all the annotations (bounding boxes, category_ids) that belong to that image (image_id).
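The difference between the two layouts can be sketched in plain Python, without any TFRecord code. This is only an illustration under assumptions: the rows, the field names beyond those mentioned above, and the helper `group_by_image` are hypothetical. It converts per-box rows (first layout) into one record per image (second layout), storing the boxes as a flat list with 4 values per box.

```python
from collections import defaultdict

# Hypothetical per-box annotations (first layout): one row per bounding box.
# Values are made up for illustration, not taken from the real dataset.
rows = [
    {"image_id": 1, "filename": "a.jpg", "bbox": [10.0, 20.0, 50.0, 60.0], "category_id": 0},
    {"image_id": 1, "filename": "a.jpg", "bbox": [70.0, 15.0, 30.0, 40.0], "category_id": 1},
    {"image_id": 2, "filename": "b.jpg", "bbox": [5.0, 5.0, 25.0, 25.0], "category_id": 0},
]


def group_by_image(rows):
    """Convert per-box rows into one record per image (second layout)."""
    grouped = defaultdict(lambda: {"filename": None, "bboxes": [], "category_ids": []})
    for row in rows:
        record = grouped[row["image_id"]]
        record["filename"] = row["filename"]
        record["bboxes"].extend(row["bbox"])  # flat list, 4 values per box
        record["category_ids"].append(row["category_id"])
    return dict(grouped)


records = group_by_image(rows)
for image_id, record in records.items():
    num_boxes = len(record["category_ids"])
    print(image_id, record["filename"], num_boxes, record["bboxes"])
```

In the per-image layout the image bytes would be stored once, and the flat `bboxes` and `category_ids` lists have a different length for each image, so they would have to be written and parsed as variable-length features (for example with `tf.io.VarLenFeature` on the reading side).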

