I trained a CNN for emotion recognition and used two different transformation pipelines for image preprocessing:
Simple Transformation:
from torchvision import transforms

TRANSFORM = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),
    # ImageNet red-channel statistics applied to the single grayscale channel
    transforms.Normalize(mean=[0.485], std=[0.229])
])
Extended Data Augmentation:
TRANSFORM = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.Grayscale(num_output_channels=1),
    transforms.RandomHorizontalFlip(p=0.5),
    # small random rotations, shifts, and zooms; blank borders filled with black
    transforms.RandomAffine(degrees=5, translate=(0.1, 0.1), scale=(0.9, 1.1), fill=0),
    transforms.GaussianBlur(kernel_size=(3, 3), sigma=(0.1, 1.0)),
    transforms.ToTensor(),
    # note: different normalization statistics than the simple pipeline
    transforms.Normalize(mean=[0.5], std=[0.5]),
])
I trained my model with both transformations separately and obtained two different accuracy curves for training and validation.
What key factors should I consider when interpreting the differences in these accuracy curves?
[figures: Simple_Transformation_Curve, Augmentation_Curve]
1 Answer
First, a few remarks that help with this kind of analysis in general:
- use the same axis scales for both runs, so the curves are directly comparable
- use a logarithmic y-axis for the loss, to get a better view of the trend; a plotting sketch follows below
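A minimal plotting sketch along those lines, assuming the per-epoch accuracies and losses were logged during training (the arrays below are dummy placeholders, not your actual results):

import numpy as np
import matplotlib.pyplot as plt

# dummy placeholder curves; substitute the real per-epoch logs of both runs
epochs = np.arange(1, 51)
simple_val_acc = 0.60 * (1 - np.exp(-epochs / 8))
aug_val_acc    = 0.65 * (1 - np.exp(-epochs / 15))
simple_val_loss = 1.5 * np.exp(-epochs / 10) + 0.8
aug_val_loss    = 1.8 * np.exp(-epochs / 20) + 0.6

fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

ax_acc.plot(epochs, simple_val_acc, label="simple")
ax_acc.plot(epochs, aug_val_acc, label="augmented")
ax_acc.set_ylim(0, 1)  # identical scale for both runs
ax_acc.set_xlabel("epoch")
ax_acc.set_ylabel("validation accuracy")
ax_acc.legend()

ax_loss.plot(epochs, simple_val_loss, label="simple")
ax_loss.plot(epochs, aug_val_loss, label="augmented")
ax_loss.set_yscale("log")  # log y-axis makes the convergence trend readable
ax_loss.set_xlabel("epoch")
ax_loss.set_ylabel("validation loss")
ax_loss.legend()

plt.tight_layout()
plt.show()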
From the two curves I would say that you should consider:
- the gap between training and validation: your "simple transform" run has a large gap, which suggests that your training set does not represent the full data distribution well (it could also be overfitting, but overfitting is usually identified by a validation loss that starts increasing again); a small sketch for quantifying this gap follows the list
- the loss curve: the "simple transform" run does not seem to improve after 25,000 steps, while the "augmentation" run could still improve (its loss is not flat at the end)
- the trend in the loss curve: in an (x, log(y)) plot it is easy to read off the convergence rate (exponential, quadratic, ...). Here your augmentation run seems to have a different convergence rate (not just a different constant factor), so it might simply need more epochs to reach the same loss level.
- and of course the accuracy values themselves, even though they do not really tell you why one model performs better or what could be improved
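To make the first point concrete, here is a small sketch for quantifying the train/validation gap, assuming per-epoch accuracies are available as arrays (the values are dummy placeholders):

import numpy as np

# dummy per-epoch accuracies; replace with the real training logs
train_acc = np.array([0.55, 0.70, 0.80, 0.88, 0.93, 0.95])
val_acc   = np.array([0.50, 0.60, 0.65, 0.66, 0.66, 0.65])

gap = train_acc - val_acc
print(f"final gap: {gap[-1]:.3f}, max gap: {gap.max():.3f}")

# a gap that keeps widening while validation accuracy stalls is the classic
# overfitting signature; a large but stable gap points more towards a
# train/validation distribution mismatch
if np.all(np.diff(gap[-3:]) > 0):
    print("gap still widening over the last epochs")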
Finally, there are many other metrics you could look at to diagnose your models: true/false positives and negatives, precision, recall, F1-score, area under the ROC curve, and so on.
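A minimal sketch with scikit-learn, assuming the model's predicted labels and the true labels on the validation set have been collected (the arrays are dummy placeholders):

import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# dummy placeholders; replace with the real validation labels and predictions
y_true = np.array([0, 2, 1, 0, 3, 2, 1, 0])
y_pred = np.array([0, 2, 2, 0, 3, 1, 1, 0])

# per-class precision, recall and F1-score
print(classification_report(y_true, y_pred))

# confusion matrix: shows which emotions get confused with which
print(confusion_matrix(y_true, y_pred))

# ROC AUC needs per-class probabilities (e.g. the CNN's softmax outputs):
# from sklearn.metrics import roc_auc_score
# print(roc_auc_score(y_true, y_proba, multi_class="ovr"))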