python - Using tf.keras.utils.image_dataset_from_directory mismatches labels on the images

Here's the code

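from tensorflow.keras.utils import image_dataset_from_directory

train_dset = image_dataset_from_directory(
    directory=data_path,
    batch_size=32,
    image_size=(256, 256),
    label_mode="int",  # labels are returned as integer class indices
    shuffle=True,
)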

In data_path, I have:

-> dataset
   |-> train
       |-> cats (having only cats images)
       |-> dogs (having only dogs images)
   |-> test
       |-> cats (having only cats images)
       |-> dogs (having only dogs images)

When I keep shuffle = True, the labels appear to be shuffled as well, so that a dog image is assigned the cat label and vice versa. How can I keep each label paired with its image and then shuffle the data, without any label mismatch?


Asked Feb 11 at 12:08 by Chinmaya Tewari

Hi @Chinmaya Tewari, you can use the following command: keras.utils.image_dataset_from_directory(directory, labels="inferred") – Sagar, Feb 11 at 13:56
Are you using only one path? I think you need one for the train data and another for the test data. In other words, you need to create two datasets. – David Sousa, Feb 11 at 21:26
@DavidSousa Certainly not, I'm using two paths, one for the train data and the other for the test data. I didn't post the code for the test set as it's going to be almost the same, my apologies for that. – Chinmaya Tewari, Feb 12 at 6:30
@Sagar I've tried using labels="inferred" as well. The outcome was the same. – Chinmaya Tewari, Feb 12 at 6:31
Hi @ChinmayaTewari, if you are using Google Colab, can you provide a gist of the code? – Sagar, Feb 12 at 6:39


1 Answer


Basically, you need a directory structure with one folder per label, like the one below:

├── train/
│   ├── cats/
│   │   ├── image1.jpg
│   │   └── ...
│   ├── dogs/
│   │   ├── imageA.jpeg
│   │   └── ...
│   └── ...

Then you can point directory at the folder that directly contains the label folders:

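from tensorflow.keras.utils import image_dataset_from_directory

train_ds = image_dataset_from_directory(
    directory="train",  # the folder whose subfolders are cats/ and dogs/
    batch_size=32,
    image_size=(256, 256),
    label_mode="int",
    shuffle=True,
)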
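
To check which integer goes with which folder, it may help to know that, as far as I know, the class index is inferred from the alphanumeric order of the subfolder names (the returned dataset also exposes the order as train_ds.class_names). The sketch below is a pure-Python stand-in for that ordering rule, not the Keras implementation; expected_class_indices is a hypothetical helper name, and the temporary folder only mimics the tree above.

```python
import os
import tempfile


def expected_class_indices(directory):
    """Map each class subfolder to the index that alphanumeric ordering gives.

    Hypothetical helper: it imitates the documented ordering rule only,
    it does not call Keras.
    """
    class_names = sorted(
        name
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway train/ folder shaped like the tree above.
with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, "train")
    for class_name in ("dogs", "cats"):  # creation order does not matter
        os.makedirs(os.path.join(train_dir, class_name))
    mapping = expected_class_indices(train_dir)

print(mapping)  # {'cats': 0, 'dogs': 1}
```

If directory points one level too high (for example at the folder that contains train and test), the same rule would turn train and test into the "classes", which would also look like mismatched labels.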

Similarly, you can create test_ds from the test folder, usually with shuffle = False.
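
If the labels only look wrong when you inspect them, one possible cause (an assumption, since the question does not show how the labels were checked) is reading the images and the labels in two separate passes over the dataset. With shuffle = True the order can change on every pass, so a list of images from one pass will not line up with a list of labels from another. The sketch below uses a hypothetical pure-Python stand-in, ReshufflingBatches, instead of a real tf.data dataset, and shows the single-pass pattern in collect_pairs.

```python
import random


class ReshufflingBatches:
    """Stand-in for a shuffled dataset (hypothetical, for this demo only).

    Yields (images, labels) batches in a new random order on every pass.
    """

    def __init__(self, examples, batch_size, seed=0):
        self.examples = list(examples)  # list of (image, label) pairs
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __iter__(self):
        order = self.examples[:]
        self.rng.shuffle(order)  # fresh order on each pass
        for start in range(0, len(order), self.batch_size):
            batch = order[start:start + self.batch_size]
            yield [img for img, _ in batch], [lab for _, lab in batch]


def collect_pairs(dataset):
    """Walk the dataset once, keeping each image next to its own label."""
    images, labels = [], []
    for batch_images, batch_labels in dataset:
        images.extend(batch_images)
        labels.extend(batch_labels)
    return images, labels


# Fake "images" are strings whose prefix reveals the true class.
examples = [(f"cat_{i}", 0) for i in range(5)] + [(f"dog_{i}", 1) for i in range(5)]
true_label = {"cat": 0, "dog": 1}
ds = ReshufflingBatches(examples, batch_size=4)

# Single pass: images and labels come from the same batches, so they agree.
images, labels = collect_pairs(ds)
aligned = all(true_label[img.split("_")[0]] == lab for img, lab in zip(images, labels))
print("single pass aligned:", aligned)

# Two passes: each pass is shuffled independently, so pairs may not line up.
images_a = [img for imgs, _ in ds for img in imgs]
labels_b = [lab for _, labs in ds for lab in labs]
mismatches = sum(
    true_label[img.split("_")[0]] != lab for img, lab in zip(images_a, labels_b)
)
print("two-pass mismatches:", mismatches)
```

The same loop shape, for batch_images, batch_labels in train_ds, should carry over to the dataset returned by image_dataset_from_directory, since it yields (images, labels) batches.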
