I have a dataset containing 300 samples, but I want to generate new samples (so that I get N = 100 000) based on the existing ones. I have tried this:
import numpy as np
from scipy.stats import gaussian_kde

N = 100_000  # number of samples to generate

# These lists are just examples, not the actual data (which has length 300)
masses = [0.8, 3.4, 2.1, 4.0, ...]
areas = [0.014, 0.12, 0.0071, 0.004, ...]

# Fit a separate 1D KDE to each variable and draw N values from each
kde_mass = gaussian_kde(masses)
kde_area = gaussian_kde(areas)
mass_samples_kde = kde_mass.resample(N)[0]
areas_samples_kde = kde_area.resample(N)[0]

# Pick the same random indices from both sample arrays
random_indices = np.random.choice(len(mass_samples_kde), size=N, replace=True)
mass_distribution = mass_samples_kde[random_indices]
area_distribution = areas_samples_kde[random_indices]
but I'm not sure whether this is correct. It is also important that the randomly chosen values share the same index, since each area corresponds to a mass. At the moment I also get negative values, which I don't want, since area and mass obviously cannot be negative.
Does anyone know a way to generate new samples from existing data, and a way to exclude negatives and zeros while doing so?
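
One possible direction, given only as a rough sketch: fit a single two-dimensional `gaussian_kde` to the paired data instead of two independent ones, and fit it on the logarithm of the values so that every resampled point maps back to a strictly positive number. The stand-in data below is hypothetical (random lognormal values, not the real 300 samples), and the log transform assumes all original masses and areas are strictly positive. Whether a log-scale KDE suits the real data is an open question.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real data: 300 positive, paired values.
masses = rng.lognormal(mean=0.5, sigma=0.6, size=300)
areas = rng.lognormal(mean=-4.0, sigma=1.0, size=300)

N = 100_000

# Stack the pairs into shape (2, 300) and work in log space,
# so that exponentiating the samples can never give a value <= 0.
log_data = np.log(np.vstack([masses, areas]))

# One joint KDE: each resampled column is a (log mass, log area) pair,
# so mass and area stay matched by index.
kde_joint = gaussian_kde(log_data)
log_samples = kde_joint.resample(N)  # shape (2, N)

# Transform back to the original scale and unpack the two rows.
mass_distribution, area_distribution = np.exp(log_samples)
```

Here `mass_distribution[i]` and `area_distribution[i]` come from the same joint draw, so no extra index shuffling is needed afterwards.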