Here's the code:
train_dset = image_dataset_from_directory(
    directory=data_path,
    batch_size=32,
    image_size=(256, 256),
    label_mode="int",
    shuffle=True,
)
In data_path, I have:
-> dataset
   |-> train
   |   |-> cats (only cat images)
   |   |-> dogs (only dog images)
   |-> test
       |-> cats (only cat images)
       |-> dogs (only dog images)
When I keep shuffle = True, the labels also seem to get shuffled: a dog image is assigned the cat label and vice versa. How can I tie each label to its image and then shuffle the data, without any label mismatch?
asked Feb 11 at 12:08 by Chinmaya Tewari
- Hi @Chinmaya Tewari, you can use the following command: keras.utils.image_dataset_from_directory(directory, labels="inferred") – Sagar Commented Feb 11 at 13:56
- Are you using only one path? I think you need to have to make one for train and other for test data. In other words you need to create two datasets. – David Sousa Commented Feb 11 at 21:26
- @DavidSousa Certainly not, I'm using two paths one for train data and other for test data. I didn't post the code for the test as it's going to be almost same, my apologies for that. – Chinmaya Tewari Commented Feb 12 at 6:30
- @Sagar I've tried using labels="inferred" as well. Outcome was all the same – Chinmaya Tewari Commented Feb 12 at 6:31
- Hi @ChinmayaTewari if you are using Google_colab can you provide the gist file of the code – Sagar Commented Feb 12 at 6:39
1 Answer
Basically, you need a directory structure with one folder per label, like below:
├── train/
│ ├── cats/
│ │ ├── image1.jpg
│ │ └── ...
│ ├── dogs/
│ │ ├── imageA.jpeg
│ │ └── ...
│ └── ...
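Then you can specify:
from keras.utils import image_dataset_from_directory  # assumed import; adjust to your Keras/TensorFlow install

train_ds = image_dataset_from_directory(
    directory="train",   # the folder that directly contains cats/ and dogs/
    batch_size=32,
    image_size=(256, 256),
    label_mode="int",
    shuffle=True,
)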
Similarly, you can create test_ds, usually with shuffle = False.
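To illustrate why shuffling should not separate an image from its label when the folders are laid out this way, here is a minimal standard-library sketch of the idea: infer the label from the parent folder name, pair it with the file path, and only then shuffle the pairs. This is not Keras' actual implementation. The helper names `list_labeled_files` and `shuffled_pairs` are hypothetical, the temporary tree holds empty placeholder files rather than real images, and the alphabetical class ordering is an assumption based on the documented default for inferred labels.

```python
import random
import tempfile
from pathlib import Path


def list_labeled_files(root):
    """Return (class_names, [(path, label), ...]) with labels taken from subfolder names."""
    root = Path(root)
    # Assumption: class indices follow the alphabetical order of the subfolders.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    pairs = []
    for label, name in enumerate(class_names):
        for f in sorted((root / name).iterdir()):
            if f.is_file():
                pairs.append((f, label))
    return class_names, pairs


def shuffled_pairs(pairs, seed=None):
    """Shuffle whole (path, label) tuples, so a file never loses its label."""
    out = list(pairs)
    random.Random(seed).shuffle(out)
    return out


# Throwaway tree mirroring train/cats and train/dogs (empty placeholder files).
tmp = tempfile.TemporaryDirectory()
train_dir = Path(tmp.name) / "train"
for cls in ("cats", "dogs"):
    (train_dir / cls).mkdir(parents=True)
    for i in range(3):
        (train_dir / cls / f"{cls}_{i}.jpg").touch()

class_names, pairs = list_labeled_files(train_dir)
mixed = shuffled_pairs(pairs, seed=0)

# Every shuffled entry still carries the label of the folder it came from.
for path, label in mixed:
    print(path.name, "->", class_names[label])
```

If the labels still look wrong with a layout like this, one possible cause (not confirmed by the question) is reading images and labels in two separate passes over the shuffled dataset. With shuffle = True the dataset is, by default, reshuffled on each iteration, so the two passes come back in different orders. Taking images and labels from the same batch, as in `for images, labels in train_ds`, keeps them aligned.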