I'm a researcher and I want to predict how the data from an instrument will behave after more than 10,000 cycles. I have data from 10,000 cycles and would now like to make a precise prediction. With the simplest version I get repetition (a copy of the historical data in the window):
import numpy as np
import matplotlib.pyplot as plt

for catalyst, currents in max_on_currents.items():
    cycles = range(start_cycle, num_cycles)
    plt.plot(cycles, currents, label=f'{catalyst}', alpha=0.7)

    try:
        # --- Rolling Window Prediction ---
        # Use the last N cycles to predict the future
        window_size = 500  # Number of recent cycles to use for prediction
        if len(currents) < window_size:
            window_size = len(currents)  # Adjust if data is smaller than window

        # Extract the most recent window of data
        recent_data = currents[-window_size:]

        # Repeat the recent data to create predictions
        repeat_times = (pred_cycles // window_size) + 1
        predicted = np.tile(recent_data, repeat_times)[:pred_cycles]

        # Add some noise to avoid exact repetition (10% of the window's std)
        noise = np.random.normal(0, np.std(recent_data) * 0.1, len(predicted))
        predicted = predicted + noise
        predicted = np.maximum(predicted, 0)  # Ensure non-negative values
    except Exception as e:
        print(f"Error in {catalyst}: {str(e)}")
        continue

    # Plot predictions
    pred_range = range(num_cycles, num_cycles + pred_cycles)
    plt.plot(pred_range, predicted, linestyle='--', lw=1.5,
             label=f'{catalyst} Prediction')
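For context, the variables used in the loop come from a setup roughly like this (the values and the catalyst name here are only placeholders, not my real data):

import numpy as np
import matplotlib.pyplot as plt

# Placeholder setup so the snippet above runs on its own
start_cycle = 0
num_cycles = 10_000                 # cycles already measured
pred_cycles = 5_000                 # cycles to forecast
cycle_axis = np.arange(start_cycle, num_cycles)

# One noisy periodic trace per catalyst (synthetic example data)
max_on_currents = {
    'Catalyst A': 1.0 + 0.2 * np.sin(2 * np.pi * cycle_axis / 500)
                  + np.random.normal(0, 0.02, num_cycles),
}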
The loop above gives me an exact copy. Later I will try an LSTM, but it takes too long to run and I don't see any results after a couple of minutes.
My final goal is to keep the same data peaks that I have in the measured cycles, not flat lines (ARIMA or Theta just give me lines). I have many CSV files to analyse, so ideally this would work as a universal Python script, as in the sketch below.
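Something along these lines is what I have in mind: the window-repetition forecast wrapped in a function and applied to every CSV file. The file pattern 'data/*.csv' and the column name 'max_on_current' are just assumptions standing in for my real files.

import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def repeat_window_forecast(currents, pred_cycles, window_size=500, noise_frac=0.1):
    """Tile the last `window_size` cycles to cover `pred_cycles`, with a bit of noise."""
    currents = np.asarray(currents, dtype=float)
    window_size = min(window_size, len(currents))
    recent = currents[-window_size:]
    repeats = pred_cycles // window_size + 1
    predicted = np.tile(recent, repeats)[:pred_cycles]
    predicted += np.random.normal(0, np.std(recent) * noise_frac, pred_cycles)
    return np.maximum(predicted, 0)  # keep currents non-negative

# Placeholder file pattern and column name -- my real CSVs differ
for path in glob.glob('data/*.csv'):
    df = pd.read_csv(path)
    currents = df['max_on_current'].to_numpy()   # assumed column name
    num_cycles = len(currents)
    pred_cycles = 5_000                          # how far ahead to predict

    predicted = repeat_window_forecast(currents, pred_cycles)

    plt.figure()
    plt.plot(range(num_cycles), currents, alpha=0.7, label='measured')
    plt.plot(range(num_cycles, num_cycles + pred_cycles), predicted,
             linestyle='--', lw=1.5, label='prediction')
    plt.legend()
    plt.title(path)
plt.show()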