
Easiest method for predicting research data in Python - Stack Overflow


I'm a researcher and want to predict how the data from an instrument will look beyond 10,000 cycles. I have data from 10,000 cycles and would now like to make a precise prediction. With the simplest version I only get a repetition (a copy of the historical data in the window):

import numpy as np
import matplotlib.pyplot as plt

# max_on_currents, start_cycle, num_cycles and pred_cycles are assumed to be
# defined earlier in the script (not shown)
for catalyst, currents in max_on_currents.items():
    cycles = range(start_cycle, num_cycles)
    plt.plot(cycles, currents, label=f'{catalyst}', alpha=0.7)

    try:
        # --- Rolling Window Prediction ---
        # Use the last N cycles to predict the future
        window_size = 500  # Number of recent cycles to use for prediction
        if len(currents) < window_size:
            window_size = len(currents)  # Adjust if data is smaller than window

        # Extract the most recent window of data
        recent_data = currents[-window_size:]

        # Repeat the recent data to create predictions
        repeat_times = (pred_cycles // window_size) + 1
        predicted = np.tile(recent_data, repeat_times)[:pred_cycles]

        # Add some noise to avoid exact repetition
        # (noise std = 10% of the window's standard deviation)
        noise = np.random.normal(0, np.std(recent_data) * 0.1, len(predicted))
        predicted = predicted + noise
        predicted = np.maximum(predicted, 0)  # Ensure non-negative values

    except Exception as e:
        print(f"Error in {catalyst}: {e}")
        continue  # No prediction to plot for this catalyst

    # Plot predictions
    pred_range = range(num_cycles, num_cycles + pred_cycles)
    plt.plot(pred_range, predicted, linestyle='--', lw=1.5,
             label=f'{catalyst} Prediction')

This gives me what is essentially an exact copy of the last window. I also tried an LSTM, but it takes too long to run, and I don't see any results after a couple of minutes.
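To make the copying behaviour concrete, here is a minimal sketch of the tiling step on toy numbers. The three-value array and the forecast length of 7 are made up for illustration and do not come from the instrument data.

```python
import numpy as np

# Toy stand-in for the last window of measured currents (hypothetical values)
recent_data = np.array([1.0, 3.0, 2.0])
pred_cycles = 7  # hypothetical forecast length

# Same logic as in the loop: tile the window, then cut to the forecast length
repeat_times = (pred_cycles // len(recent_data)) + 1
predicted = np.tile(recent_data, repeat_times)[:pred_cycles]

print(predicted)  # [1. 3. 2. 1. 3. 2. 1.]
```

The forecast is just the window concatenated with itself, so any drift across the 10,000 cycles is lost, and the added noise only blurs the copy.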

My final goal is a forecast that keeps the same kind of peaks as the measured data, not a smooth line (ARIMA and Theta only produce lines). I have many CSV files to analyse, so the solution should work universally as a Python script.
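One possible direction, shown only as a minimal sketch: fit a straight line to the last window, extrapolate it, and put the detrended window pattern back on top so the peaks survive. This assumes the drift is roughly linear over the window, which may not hold for real catalyst data. The function name `forecast_trend_plus_pattern` and the synthetic demo series are hypothetical and only serve the illustration.

```python
import numpy as np

def forecast_trend_plus_pattern(currents, pred_cycles, window_size=500):
    """Extrapolate a linear trend and re-apply the detrended window pattern."""
    y = np.asarray(currents, dtype=float)
    window_size = min(window_size, len(y))
    if window_size < 2:
        raise ValueError("need at least two data points to fit a trend")

    recent = y[-window_size:]
    x = np.arange(window_size)

    # Least-squares straight line through the last window (assumes linear drift)
    slope, intercept = np.polyfit(x, recent, 1)
    residuals = recent - (slope * x + intercept)  # peaks around the trend

    # Continue the line into the future and tile the residual pattern on top
    future_x = np.arange(window_size, window_size + pred_cycles)
    repeat_times = pred_cycles // window_size + 1
    pattern = np.tile(residuals, repeat_times)[:pred_cycles]
    predicted = slope * future_x + intercept + pattern

    return np.maximum(predicted, 0)  # currents assumed non-negative, as above

# Synthetic demo series (made up): slow decay plus measurement noise
rng = np.random.default_rng(0)
demo = 10 - 0.001 * np.arange(1000) + rng.normal(0, 0.05, 1000)
forecast = forecast_trend_plus_pattern(demo, pred_cycles=300, window_size=200)
print(forecast.shape)  # (300,)
```

Because it only needs NumPy and a 1-D array, the same function could be called once per column or per CSV file. The peaks still repeat with the window period, so this is a baseline to compare against rather than a physical model.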


