
Ragdoll_X_Furry t1_iwc23i9 wrote

A few more details about your implementation would help us help you:

  1. How many images are you using for validation?

  2. What batch size and optimizer are you using during training?

  3. What's the dropout rate in the Dropout layers?

  4. How are you preprocessing the images before feeding them to your model? Are you using the tf.keras.applications.densenet.preprocess_input function as suggested in the Keras documentation?
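(For reference, wiring it into a generator usually looks something like this; note that it replaces manual rescaling:)

    from tensorflow.keras.applications.densenet import preprocess_input
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # densenet's preprocess_input does its own scaling and ImageNet
    # normalization, so don't combine it with rescale=1/255.
    datagen = ImageDataGenerator(preprocessing_function=preprocess_input)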

You should try increasing the batch size if you can, and use data augmentation as others have already suggested.

You can also try other networks besides DenseNet, like one of the ResNet or EfficientNet models, and you can replace the Flatten layer with a GlobalAvgPool2D or GlobalMaxPool2D layer to reduce the parameter count (in my experience the former gives better results). Also, that resizing layer might not be necessary to improve accuracy.
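For example, a sketch of the pooling swap with DenseNet201, assuming 224×224 inputs and a placeholder 8-class head:

    from tensorflow.keras import layers, Model
    from tensorflow.keras.applications import DenseNet201

    base = DenseNet201(include_top=False, input_shape=(224, 224, 3))

    # one value per channel: ~1920 features instead of the ~7*7*1920 Flatten would give
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(8, activation='softmax')(x)  # 8 classes is a placeholder

    model = Model(base.input, outputs)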


Tiny-Mud6713 OP t1_iwcglrg wrote

1- I'm doing a 20% validation split, so it's around 2,800 images for training and 700 for validation.

2- Batch size of 8; Adam with LR=0.001 for the transfer-learning phase and LR=0.0001 for fine-tuning. Any other combination caused everything to fall apart.

3- Currently 0.3; 0.5 caused some early-stopping problems since the model got stuck.

4- Here's the preprocessing:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # validation: rescale only, no augmentation
    valid_data_gen = ImageDataGenerator(rescale=1/255.)

    # training: rescale plus geometric augmentation
    train_data_gen = ImageDataGenerator(
        rescale=1/255.,
        rotation_range=30,
        width_shift_range=0.2,
        height_shift_range=0.2,
        horizontal_flip=True,
        vertical_flip=True
    )


and then flow from file to get the preprocessed images
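Roughly like this (the path, target size, and class_mode here are placeholders for my actual setup):

    train_gen = train_data_gen.flow_from_directory(
        'data/train',            # placeholder path
        target_size=(224, 224),  # placeholder size
        batch_size=8,
        class_mode='categorical'
    )

    valid_gen = valid_data_gen.flow_from_directory(
        'data/valid',
        target_size=(224, 224),
        batch_size=8,
        class_mode='categorical'
    )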


Ragdoll_X_Furry t1_iwcxiv6 wrote

Adam is usually more prone to overfitting than SGD, so using SGD with Nesterov momentum might help a bit. I'd also recommend augmenting contrast, brightness, saturation, and hue if those options are available for the ImageDataGenerator class.
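Something along these lines, if you want to try it (a rough sketch; ImageDataGenerator only exposes brightness directly, so I'm routing the other color augmentations through preprocessing_function, which assumes TF2 eager mode, and the learning rate is just a starting point to tune):

    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # SGD with Nesterov momentum instead of Adam
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)

    def color_jitter(img):
        # the generator hands us a float array in [0, 255]; tf.image's
        # hue/saturation ops expect [0, 1], so scale around them
        x = tf.convert_to_tensor(img / 255.0, dtype=tf.float32)
        x = tf.image.random_contrast(x, 0.8, 1.2)
        x = tf.image.random_saturation(x, 0.8, 1.2)
        x = tf.image.random_hue(x, 0.05)
        x = tf.clip_by_value(x, 0.0, 1.0)
        return x.numpy() * 255.0

    train_data_gen = ImageDataGenerator(
        rescale=1/255.,
        brightness_range=(0.8, 1.2),
        preprocessing_function=color_jitter,
        rotation_range=30,
        horizontal_flip=True,
        vertical_flip=True
    )

    # then compile with it, e.g. model.compile(optimizer=optimizer, loss='categorical_crossentropy')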

Also, does the rotation in the ImageDataGenerator fill the background with black pixels, or is there an option to extend/reflect the image? In my experience simply filling the background with black after rotation tends to hurt accuracy.
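If I remember right, ImageDataGenerator has a fill_mode argument for exactly this, worth a try:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # 'reflect' mirrors the image into the corners exposed by rotation;
    # the default is 'nearest', and 'constant' with cval=0 gives black borders
    train_data_gen = ImageDataGenerator(rotation_range=30, fill_mode='reflect')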

One trick that might also help is to feed your classifier not just the output of the pretrained network's last layer but also outputs from earlier layers; in my experience this can improve accuracy. I've done this with EfficientNetB0, so I've sketched some example code below to help you out, though if you don't want to use an EfficientNet, I'm sure it can be adapted to DenseNet201 too.
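The gist of it is something like this (the tap layer names come from tf.keras's EfficientNetB0, so double-check them with base.summary(); the head itself is just an illustration):

    import tensorflow as tf
    from tensorflow.keras import layers, Model
    from tensorflow.keras.applications import EfficientNetB0

    base = EfficientNetB0(include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # freeze for the transfer-learning phase

    # pool features from a few intermediate blocks as well as the final output
    tap_names = ['block3a_expand_activation',
                 'block5a_expand_activation',
                 'top_activation']
    taps = [layers.GlobalAveragePooling2D()(base.get_layer(name).output)
            for name in tap_names]

    x = layers.Concatenate()(taps)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(8, activation='softmax')(x)  # 8 classes is a placeholder

    model = Model(base.input, outputs)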

Of course, sometimes transfer learning just doesn't really help, so if nothing else pushes the accuracy above 90%, it might be best to build and train your own model from scratch to better suit your needs.


Tiny-Mud6713 OP t1_iwd0cqz wrote

I haven't tried playing with the optimizer, thanks for pointing that out. And thanks for the code, I'll play around with it too :)


Tiny-Mud6713 OP t1_iwcgqff wrote

Actually, the resizing really boosted the performance, by about 5%. I'm at 80% now, but still looking to push it higher.
