最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - How to find the most common frequeny in Time series - Stack Overflow

programmeradmin1浏览0评论

I have a TimeSeries shown below. We can clearly see that there is a cyclical behavior in the data, where rise and fall happens every 12 months and so on, irrespective of if trend is rising/falling. I am struggling to find how to extract this frequency of 12 from the data using scipy.fft python library. Mostly every time repetition happens around 12 months, but it can be around 11 months as well or 13 months, but 12 is the most common frequency. I am using using Fourier transformation, but can't figure out how to extract this most common frequency (12) using some python library?

Thanks

Here is the data:

11130.578853063385,
6509.723592808,
5693.928982796129,
3415.099921325464,
-9299.291673374173,
-3388.284173885658,
-5577.9316298032745,
-3509.583232678111,
2285.99065857961,
3844.3061166856014,
-7383.526882168155,
-4622.792125020905,
2813.586128745183,
-1501.9405075290495,
8911.954971658348,
7800.444458471794,
-1190.7952377053866,
4768.791467877524,
2173.593871988719,
-2420.04786197912,
-2304.842777539399,
-3562.1788525161837,
-8766.208378658988,
-7655.936603573945,
-5890.9298543619125,
-9628.764012284291,
12124.740839506767,
12391.257220312522,
7512.253051850619,
12921.032383220418,
10063.270097922614,
-1350.2599176773563,
-6887.434936788105,
-11116.26528794868,
-10196.871058658888,
-10874.172006461778,
-15014.190086779208,
-17837.744786550902,
15235.434771455053,
17183.25815161994,
16835.95193044929,
21063.986176551374,
17987.99577807288,
-270.6290142721815,
-11239.957979217992,
-18724.854251002133,
-11752.820614075652,
-14332.597031388648,
-24609.22398631297,
-26155.98046224267,
18192.356438131075,
22165.14150786262,
26758.419290443217,
29284.65841543005,
25762.928479462924,
865.3393849464444,
-15121.264730579132,
-26306.45361387036,
-13494.286360139175,
-18089.58324839494,
-34738.184049794625,
-34718.87495981627,
21145.112760626133,
27322.030709198487,
37252.78168890166,
37846.98231395838,
33206.62103950547,
2092.870600583023,
-18537.521405900694,
-33955.48182565038,
-15445.551953312479,
-22284.152196532646,
-45880.94206326016,
-44229.92788481257,
24988.038646046363,
32958.71017047145,
49117.93320642349,
47304.760779654374,
40776.01828187993,
3403.573579093782,
-22402.79273128273,
-42361.96378730598,
-17190.060741565456,
-27378.2527904574,
-59155.49212555031,
-60122.10588005664,
26272.133100405994,
44887.192435244986,
69002.74742137044,
59037.928523261784,
42122.51604012313,
6075.663868325184,
-20631.710295791454,
-48088.66531781877,
-23396.29341809641,
-40847.479839729145,
-68317.87342502769,
-73679.4424942532,
28302.69374713241,
57321.16868946109,
83820.10748232565,
68399.66173487401,
44989.374076533895,
8830.088516704302,
-18149.500187183363,
-52028.5021898363,
-31013.963236266634,
-53956.5249205745,
-77250.59604604884,
-86642.45203443282,
30541.62328593645,
69812.47143595785,
98233.7834300242,
77385.915451272,
48189.69475295938,
11504.22579592029,
-15251.799652343976,
-55879.292898282,
-38956.992207762654,
-67210.9936142441,
-86636.69916492153,
-99845.12467446178,
32751.253099701484,
82656.01928819218,
113259.2399845611,
86532.20966362985,
51019.20889397171,
14289.09297146163,
-11777.371574335935,
-59627.30976102835,
-47170.18721199697,
-81027.36407627042,
-96178.09587995053,
-113526.93736260894,
34817.23859755824,
95927.57143777516,
128782.84687524068,
95920.65382048927,
53226.62965224956,
17272.000877533148,
-7716.869736424804,
-63110.06727848651,
-55696.68126167806,
-95538.60898488457,
-105325.08525283691,
-127600.17956244369,
36734.97589442811,
109601.51109750797,
144205.71977383518,
105517.48123365057,
54793.814706888734,
20380.77940730315,
-3119.1108027357986,
-66153.73274186133,
-64702.85998743505,
-110650.72884973585

I have a TimeSeries shown below. We can clearly see that there is a cyclical behavior in the data, where rise and fall happens every 12 months and so on, irrespective of if trend is rising/falling. I am struggling to find how to extract this frequency of 12 from the data using scipy.fft python library. Mostly every time repetition happens around 12 months, but it can be around 11 months as well or 13 months, but 12 is the most common frequency. I am using using Fourier transformation, but can't figure out how to extract this most common frequency (12) using some python library?

Thanks

Here is the data:

11130.578853063385,
6509.723592808,
5693.928982796129,
3415.099921325464,
-9299.291673374173,
-3388.284173885658,
-5577.9316298032745,
-3509.583232678111,
2285.99065857961,
3844.3061166856014,
-7383.526882168155,
-4622.792125020905,
2813.586128745183,
-1501.9405075290495,
8911.954971658348,
7800.444458471794,
-1190.7952377053866,
4768.791467877524,
2173.593871988719,
-2420.04786197912,
-2304.842777539399,
-3562.1788525161837,
-8766.208378658988,
-7655.936603573945,
-5890.9298543619125,
-9628.764012284291,
12124.740839506767,
12391.257220312522,
7512.253051850619,
12921.032383220418,
10063.270097922614,
-1350.2599176773563,
-6887.434936788105,
-11116.26528794868,
-10196.871058658888,
-10874.172006461778,
-15014.190086779208,
-17837.744786550902,
15235.434771455053,
17183.25815161994,
16835.95193044929,
21063.986176551374,
17987.99577807288,
-270.6290142721815,
-11239.957979217992,
-18724.854251002133,
-11752.820614075652,
-14332.597031388648,
-24609.22398631297,
-26155.98046224267,
18192.356438131075,
22165.14150786262,
26758.419290443217,
29284.65841543005,
25762.928479462924,
865.3393849464444,
-15121.264730579132,
-26306.45361387036,
-13494.286360139175,
-18089.58324839494,
-34738.184049794625,
-34718.87495981627,
21145.112760626133,
27322.030709198487,
37252.78168890166,
37846.98231395838,
33206.62103950547,
2092.870600583023,
-18537.521405900694,
-33955.48182565038,
-15445.551953312479,
-22284.152196532646,
-45880.94206326016,
-44229.92788481257,
24988.038646046363,
32958.71017047145,
49117.93320642349,
47304.760779654374,
40776.01828187993,
3403.573579093782,
-22402.79273128273,
-42361.96378730598,
-17190.060741565456,
-27378.2527904574,
-59155.49212555031,
-60122.10588005664,
26272.133100405994,
44887.192435244986,
69002.74742137044,
59037.928523261784,
42122.51604012313,
6075.663868325184,
-20631.710295791454,
-48088.66531781877,
-23396.29341809641,
-40847.479839729145,
-68317.87342502769,
-73679.4424942532,
28302.69374713241,
57321.16868946109,
83820.10748232565,
68399.66173487401,
44989.374076533895,
8830.088516704302,
-18149.500187183363,
-52028.5021898363,
-31013.963236266634,
-53956.5249205745,
-77250.59604604884,
-86642.45203443282,
30541.62328593645,
69812.47143595785,
98233.7834300242,
77385.915451272,
48189.69475295938,
11504.22579592029,
-15251.799652343976,
-55879.292898282,
-38956.992207762654,
-67210.9936142441,
-86636.69916492153,
-99845.12467446178,
32751.253099701484,
82656.01928819218,
113259.2399845611,
86532.20966362985,
51019.20889397171,
14289.09297146163,
-11777.371574335935,
-59627.30976102835,
-47170.18721199697,
-81027.36407627042,
-96178.09587995053,
-113526.93736260894,
34817.23859755824,
95927.57143777516,
128782.84687524068,
95920.65382048927,
53226.62965224956,
17272.000877533148,
-7716.869736424804,
-63110.06727848651,
-55696.68126167806,
-95538.60898488457,
-105325.08525283691,
-127600.17956244369,
36734.97589442811,
109601.51109750797,
144205.71977383518,
105517.48123365057,
54793.814706888734,
20380.77940730315,
-3119.1108027357986,
-66153.73274186133,
-64702.85998743505,
-110650.72884973585
Share Improve this question edited yesterday cph_sto asked Feb 7 at 16:44 cph_stocph_sto 7,58714 gold badges45 silver badges89 bronze badges 2
  • 1 Can you supply the time series in some form? – lastchance Commented Feb 7 at 18:10
  • Added the data. Sorry, I was off my workstation in the weekend. – cph_sto Commented yesterday
Add a comment  | 

1 Answer 1

Reset to default 1

Original time series wasn't provided, so I made some up. EDIT: newly-supplied data provided at the end of the post

Take the magnitude of the DFT using numpy.fft.fft or, for real data as here, numpy.fft.rfft, and calculate its amplitude. (If your data isn't equally spaced in time you can interpolate first.)

You can find peaks in the frequency trace, but note that there can be a lot of energy in low-frequency components, so you may need to exclude these.

Note that the frequency interval is 2.pi/tmax in rad/time_unit, or 1/tmax in cycles/time_unit, so this limits the accuracy you will get.

import numpy as np
import matplotlib.pyplot as plt

T = 150
N = 10000                                          # For accurate frequency, N and number of cycles must be large
t = np.linspace( 0.0, T, N, endpoint=False )
signal = 75000 * ( t / 150 ) ** 1.5 * ( 1 + 0.8 * np.sin( 2 * np.pi * t / 12 ) + 0.4 * np.sin( 2 * np.pi * t / 4 ) )

dw = 2 * np.pi / T                                 # Frequency interval in rad/time_unit
df = 1 / T                                         # Frequency interval in cycles/time_unit
dt = T / N

FT = np.fft.rfft( signal )                         # Use rfft for a real signal (note: only about half the components)
A = np.abs( FT ) / N                               # Normalised magnitude of Fourier components
w = np.linspace( 0.0, N * dw, N, endpoint=False )  # Frequency values in rad/time_unit

low = 4                                            # Number of low-frequency components to exclude
m = low + np.argmax( A[low:] )                     # Index of dominant frequency (excluded low frequencies)
omega = m * dw                                     # Dominant frequency in rad/s
f = omega / ( 2 * np.pi )                          # Dominant frequency in Hz
#f = np.fft.rfftfreq( N, dt )[m]                   # Alternative, direct from numpy.fft

print( 'Index:', m )
print( 'Frequency interval in rad/time_unit    (dw):', dw    )
print( 'Dominant frequency in rad/time_unit    (w) :', omega )
print( 'Frequency interval in cycles/time_unit (df):', df    )
print( 'Dominant frequency in cycles/time_unit (f) :', f     )

plt.title( "Time Domain" )
plt.plot( t, signal, 'b-' )
plt.xlabel( 't' )
plt.show()

plt.title( "Frequency Domain" )
plt.plot( w[:len(A)] / ( 2 * np.pi ), A, 'r-' )
plt.xlim( 0.0, 0.6 )
plt.xlabel( 'f' )
plt.show()

Output:

Index: 12
Frequency interval in rad/time_unit    (dw): 0.041887902047863905
Dominant frequency in rad/time_unit    (w) : 0.5026548245743668
Frequency interval in cycles/time_unit (df): 0.006666666666666667
Dominant frequency in cycles/time_unit (f) : 0.07999999999999999

A time period of 12 corresponds to an f-frequency of 1/12=0.083. Accuracy is limited by df.

EDIT: Supplied Data

Assuming the data is put, as-is, in a file "signal.dat", change the first part of the code to

import numpy as np
import matplotlib.pyplot as plt

signal = np.loadtxt( 'signal.dat', delimiter=',', usecols=0, unpack=True )
N = len(signal)
T = N
t = np.linspace( 0.0, T, N, endpoint=False )
dw = 2 * np.pi / T                                 # Frequency interval in rad/time_unit
df = 1 / T                                         # Frequency interval in cycles/time_unit
dt = T / N                                         # Time interval (will be 1 month)

Then dt = 1 (month), whilst N and T are (numerically) the number of months in the record.

The output gives f=0.083 as the maximum frequency, corresponding to a period of 1/f=12 months.

Index: 13
Frequency interval in rad/time_unit    (dw): 0.04027682889217683
Dominant frequency in rad/time_unit    (w) : 0.5235987755982988
Frequency interval in cycles/time_unit (df): 0.00641025641025641
Dominant frequency in cycles/time_unit (f) : 0.08333333333333333

发布评论

评论列表(0)

  1. 暂无评论