最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

excel - Automating averages and stdevs for large dataset in set intervals of rows - Stack Overflow

programmeradmin2浏览0评论

I am currently trying to make a dataset which refers to a series of data in another tab, of which it takes 20 points (vertically) and averages them and gives a stdev then two open rows and then again average+stdev of the next 20 points. How can this be automated without having to manually insert all of the ranges?

Column A
'Blank'
'Avg1'
'StDev1'
'Blank'
'Blank'
'Avg2'
'StDev2'
'Blank'

I am currently trying to make a dataset which refers to a series of data in another tab, of which it takes 20 points (vertically) and averages them and gives a stdev then two open rows and then again average+stdev of the next 20 points. How can this be automated without having to manually insert all of the ranges?

Column A
'Blank'
'Avg1'
'StDev1'
'Blank'
'Blank'
'Avg2'
'StDev2'
'Blank'

This is the format used in an already setup datasheet to which I would like to be able to automate the above problem.

The first set is =Average('Series Data'!Q2:Q21) and if the next is set to =Average('Series Data'!Q22:Q41) and pulled down the next becomes =Average('Series Data'!Q10:Q11)

Share Improve this question edited Jan 29 at 14:41 Black cat 6,2995 gold badges28 silver badges60 bronze badges asked Jan 29 at 10:27 Jan van oestJan van oest 112 bronze badges 2
  • 1 " =Average('Series Data'!Q2:Q20) and if the next is set to =Average('Series Data'!Q22:Q42) and pulled down the next becomes =Average('Series Data'!Q10:Q11)" - I don't see any logical pattern here - do you mean something like 2-20, 22-40, 42-60? – Michal Commented Jan 29 at 10:59
  • It should be Q2:Q21, Q22:Q41, Q42:Q61, My mistake. Its sets of 20 points beginning from Q2. – Jan van oest Commented Jan 29 at 11:32
Add a comment  | 

6 Answers 6

Reset to default 3

Another option for a large dataset (e.g. 100k+ rows):

=LET(
    rws, QUOTIENT(XMATCH(,DROP('Series Data'!Q:Q,2)),20),
    arr, INDEX('Series Data'!Q:Q,SEQUENCE(rws,20,2)),
    pad, EXPAND("",rws,,""),
    TOCOL(HSTACK(pad,BYROW(arr,AVERAGE),BYROW(arr,STDEV),pad))
)

The number of iterations (rws) will be determined automatically, assuming there are no blank cells in your data series... XMATCH will return the row number of the first blank cell found in the lookup_array, when the lookup_value is omitted.

Reduce is perfect for this: =REDUCE("",SEQUENCE(3,,2,20),LAMBDA(a,b,LET(c,INDEX('Series Data'!Q:Q,SEQUENCE(20,,b)),VSTACK(a,AVERAGE(c),STDEV(c),""))))

Changing 3 will change the number of iterations of the formula

Put your data into a Table. In the formula, the table name is dataTbl and the column header is Data.

=LET(
    d, dataTbl[Data],
    a, WRAPCOLS(d, 20),
    avg, BYCOL(a, LAMBDA(arr, AVERAGE(arr))),
    stdev, BYCOL(a, LAMBDA(arr, STDEV.S(arr))),
    x, HSTACK(TOCOL(avg), TOCOL(stdev)),
    b, MAKEARRAY(
        COUNT(avg) * 4,
        1,
        LAMBDA(r, c, IFERROR(INDEX(x, INT((r - 1) / 4) + 1, MOD(r - 1, 4) + 1), ""))
    ),
    b
)
  • The 20 is the number of items in each summarization.
  • The 4 refers to the number of rows for each summarization
    • Avg
    • Std Dev
    • blank
    • blank

Enter the formula in row 2 or lower so as to have a blank row above.

Note that the STDEV function is being replaced by STDEV.S and other variations. You should read the documentation to ensure you are using the one you want.

In this formula adjust the
mul value (in the example 4) to the required count of blocks of the result.

=LET(first,'Series Data'!Q$2:Q$21,
mul,4,
cella,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,AVERAGE(OFFSET(first,INDEX(x,1,1),0)))),
cellb,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,STDEV(OFFSET(first,INDEX(x,1,1),0)))),
DROP(TOCOL(HSTACK(TOCOL(TEXTSPLIT(REPT("|",mul),"|")),cella,cellb,TOCOL(TEXTSPLIT(REPT("|",mul),"|")))),-4))

Copy it to cell A1, the formula will spill the result.

Another alternative could be this formula. For range 'Series Data'!Q1:Q1000000 the result was returned in ~ 1 second in my sample sheet.

=LET(_rng,WRAPROWS('Series Data'!Q1:Q120,20),
_avrg,BYROW(_rng,LAMBDA(r,AVERAGE(r))),
_stdev,BYROW(_rng,LAMBDA(r,STDEV(r))), 
_res,IFNA(EXPAND(HSTACK(_avrg,_stdev),,4),""),
VSTACK("",TOCOL(_res)))

Thank you all for your answers,

I've used this:

=LET(first,'Series Data'!Q$2:Q$21,
mul,108,
cella,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,AVERAGE(OFFSET(first,INDEX(x,1,1),0)))),
cellb,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,STDEV(OFFSET(first,INDEX(x,1,1),0)))),
DROP(TOCOL(HSTACK(TOCOL(TEXTSPLIT(REPT("|",mul),"|")),cella,cellb,TOCOL(TEXTSPLIT(REPT("|",mul),"|")))),-4))

Because 2161 is the maximum amount of data sets of 20, 108 is sufficient to harbor all possibilities. It gives the data how I requested, dropping down the avg, then stdev, then two blank rows.

发布评论

评论列表(0)

  1. 暂无评论