I am trying to generate a bigger dataset of satellite images depicting aggregated litter, starting from Google Earth Engine or Sentinel-2 data.
For this purpose I have found that I could use either a GAN or Stable Diffusion, so I would like to try both and see which one generates better data.
Could you help me understand if my plan is going to work?
I only have 5 satellite images showing litter, and the litter is not very clear to the naked eye. For example, see the attached image.
I plan to fine-tune a Stable Diffusion (text-to-image) model on these images and use it to generate more samples. Then I would train a GAN on the same 5 initial images.
To validate the generated datasets, I need to set some kind of 'standard' to compare against. For this I would use a pre-trained YOLOv8 model to compute the loss and accuracy, first on the data from the GAN and then on the data from Stable Diffusion.
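To make the comparison step more concrete, here is a rough sketch of the kind of metric I have in mind. It is not YOLOv8 code: it assumes I already have predicted bounding boxes from a detector and hand-labelled boxes on held-out real images, and it uses precision and recall at an IoU threshold in place of "accuracy". The names `iou`, `precision_recall`, and the example boxes are all made up for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: List[Box], truths: List[Box],
                     thr: float = 0.5) -> Tuple[float, float]:
    """Greedily match each prediction to at most one unmatched ground-truth box."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)  # each ground-truth box can be matched once
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(truths) if truths else 0.0
    return precision, recall


# Hypothetical example: labelled litter patches on one real held-out image.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
# Hypothetical detections from a detector trained on each synthetic dataset.
preds_gan = [(1, 1, 10, 10), (50, 50, 60, 60)]
preds_sd = [(0, 0, 10, 10), (21, 21, 30, 30)]

print(precision_recall(preds_gan, truths))  # (0.5, 0.5)
print(precision_recall(preds_sd, truths))   # (1.0, 1.0)
```

The idea would be to compute the same numbers for both generators on the same real test images, so the two synthetic datasets are compared on equal terms.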
Would this be a correct approach?
I appreciate your help :)