最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

Generate a summary table in Excel for multiple variables without PivotTable - Stack Overflow

programmeradmin1浏览0评论

I am trying to build a template in excel that summarizes a dataset according to test, sample type, and analyzer. Sorry - I've never posted a question about excel before, so please tell me if more info needed.

I need to calculate the Mean, Standard Deviation, and the Measurement Uncertainty for each test, separated by sample type and analyzer (MU = 1.96 x SD). My actual dataset has a few hundred 'Tests' two or three 'Types' for each, and two or four 'Analyzers'. There should be dozens of data points for each 'Result'. I will also make a few additional calcs, but keeping it simple for clarity.

Here is example data:

Test Type Name Result
pH Water Analyzer1 6.1
pH Water Analyzer2 6.5
pO2 Water Analyzer1 55
pO2 Water Analyzer2 64
pH Water Analyzer1 6.3
pH Water Analyzer2 6.8
pO2 Water Analyzer1 50
pO2 Water Analyzer2 68
pH Buffer Analyzer1 5.9
pH Buffer Analyzer2 6.4
pO2 Buffer Analyzer1 48
pO2 Buffer Analyzer2 71
pH Buffer Analyzer1 6.4
pH Buffer Analyzer2 6.9
pO2 Buffer Analyzer1 51
pO2 Buffer Analyzer2 64

I am trying to build a template in excel that summarizes a dataset according to test, sample type, and analyzer. Sorry - I've never posted a question about excel before, so please tell me if more info needed.

I need to calculate the Mean, Standard Deviation, and the Measurement Uncertainty for each test, separated by sample type and analyzer (MU = 1.96 x SD). My actual dataset has a few hundred 'Tests' two or three 'Types' for each, and two or four 'Analyzers'. There should be dozens of data points for each 'Result'. I will also make a few additional calcs, but keeping it simple for clarity.

Here is example data:

Test Type Name Result
pH Water Analyzer1 6.1
pH Water Analyzer2 6.5
pO2 Water Analyzer1 55
pO2 Water Analyzer2 64
pH Water Analyzer1 6.3
pH Water Analyzer2 6.8
pO2 Water Analyzer1 50
pO2 Water Analyzer2 68
pH Buffer Analyzer1 5.9
pH Buffer Analyzer2 6.4
pO2 Buffer Analyzer1 48
pO2 Buffer Analyzer2 71
pH Buffer Analyzer1 6.4
pH Buffer Analyzer2 6.9
pO2 Buffer Analyzer1 51
pO2 Buffer Analyzer2 64

The result I am after looks like this:

Test Type Name Mean SD MU
pH Water Analyzer1 6.2 0.35 0.69
pH Water Analyzer2 6.7 0.35 0.69
pH Buffer Analyzer1 6.2 0.14 0.28
pH Buffer Analyzer2 6.7 0.21 0.42
pO2 Water Analyzer1 52.5 3.54 6.93
pO2 Water Analyzer2 66.0 2.83 5.54
pO2 Buffer Analyzer1 49.5 2.12 4.16
pO2 Buffer Analyzer2 67.5 4.95 9.70

I am trying to automate this as much as possible, but don't want to use VBA. Unless I am missing something, I can't use a PivotTable as I have to use the Mean/SD to calculate the MU. Thus far, I have tried creating a new table using:

=UNIQUE(FILTER(B:B,A:A="pH"))

which generates an array containing Buffer and Water, but I can't include AND() in the FILTER() function to further parse the data. I have tried a bit of XLOOKUP() but that doesn't seem to work either.

Very interested to hear an approach!

***Edits: I am using Microsoft® Excel® for Microsoft 365 MSO (Version 2408 Build 16.0.17928.20336) 32-bit Previous responses have employed GROUPBY() or PIVOTBY() but these functions do not appear present in this version of Excel.

Share Improve this question edited Mar 13 at 21:59 Dorton asked Mar 13 at 21:25 DortonDorton 3291 gold badge3 silver badges12 bronze badges 10
  • 2 PIVOTBY is not available in my version of Excel. "Microsoft® Excel® for Microsoft 365 MSO (Version 2408 Build 16.0.17928.20336) 32-bit" - I have no choice on what version of Excel I have as this is installed by the anization – Dorton Commented Mar 13 at 21:40
  • 1 GROUPBY also does not appear to be present... – Dorton Commented Mar 13 at 21:43
  • 1 @Dorton please post those in the OP. So it helps others to know what version of excel you are using. – Mayukh Bhattacharya Commented Mar 13 at 21:44
  • 1 @MayukhBhattacharya - that works like a charm! I will have to run through it to understand each step, but it does exactly what was asked. – Dorton Commented Mar 13 at 23:31
  • 1 @Tom Sharpe - You solution below would be best as far as resource usage goes. I've crashed Excel a few times trying to apply the LET() function solutions. Unfortunately I have a few additional variables I need to calculate as well which means I can't only do that. – Dorton Commented Mar 14 at 15:40
 |  Show 5 more comments

3 Answers 3

Reset to default 4

Here is one way of accomplishing the desired output, using MAKEARRAY():

=LET(
     _Data, A1:D17,
     _Header, HSTACK(TAKE(_Data,1,3),"Mean","SD","MU"),
     _Body, DROP(_Data,1),
     _Catg, TAKE(_Body,,3),
     _Uniq, UNIQUE(_Catg),
     VSTACK(_Header, SORT(HSTACK(_Uniq, ROUND(MAKEARRAY(ROWS(_Uniq), 3, LAMBDA(r,c, 
     INDEX(LET(a, CHOOSECOLS(_Body,-1), 
     b, BYROW(_Catg, LAMBDA(x, AND(x=CHOOSEROWS(_Uniq, r)))), 
     d, FILTER(a,b), HSTACK(AVERAGE(d), STDEV(d), 1.96*STDEV(d))),c))),2)),{1,2})))

Or, Using GROUPBY()

=DROP(GROUPBY(A1:C17,D1:D17,HSTACK(AVERAGE,STDEV,LAMBDA(x,1.96*STDEV(x))),1,0),1)

In the absence of GROUPBY, another option is to use the concept of "thunks" (parameter-less lambda functions) to minimize redundant calculations and improve efficiency. For example:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   func,   HSTACK(LAMBDA(x, AVERAGE(x)), LAMBDA(x, STDEV.S(x))),
   result, MAP(IFNA(func, vals), IFNA(vals, func), LAMBDA(fnλ,val, fnλ(val()))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), HSTACK(DROP(keys,, 1), result, TAKE(result,, -1) * 1.96))
)

The vals variable returns a vertical vector of "thunks", LAMBDA(x), each one containing the filtered results for each unique set of keys. The func variable is defined as a horizontal vector of functions to be applied to each set of results. The elements of both vectors are TYPE 128 (compound data). IFNA is used to broadcast the func vector across each row of the vals vector, as well as the vals vector across each column of the func vector, then MAP loops through both arrays together, applying each function to each set of values. By doing it this way, the FILTER function is only evaluated ONCE per output row, the results of which are broadcast (repeated) across multiple columns and recalled as needed simply by providing the opening and closing brackets, e.g. val().

Alternatively, each function can also be mapped individually, and the results stacked together manually:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   mean,   MAP(vals, LAMBDA(x, AVERAGE(x()))),
   sdev,   MAP(vals, LAMBDA(x, STDEV.S(x()))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), HSTACK(DROP(keys,, 1), mean, sdev, sdev * 1.96))
)

Or REDUCE can be used to loop through the func vector and HSTACK the results automatically:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   func,   HSTACK(LAMBDA(x, AVERAGE(x())), LAMBDA(x, STDEV.S(x())), LAMBDA(x, STDEV.S(x()) * 1.96)),
   result, REDUCE(DROP(keys,, 1), func, LAMBDA(a,fnλ, HSTACK(a, MAP(vals, fnλ)))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), result)
)

Sample Results:

A minimalist solution, using the fact that

SD(x) * k = SD(x * k)

Add an extra column to the data

Test Type Name Result ResultX1.96
pH Water Analyzer1 6.1 11.956
pH Water Analyzer2 6.5 12.74
pO2 Water Analyzer1 55 107.8
pO2 Water Analyzer2 64 125.44
pH Water Analyzer1 6.3 12.348
pH Water Analyzer2 6.8 13.328
pO2 Water Analyzer1 50 98
pO2 Water Analyzer2 68 133.28
pH Buffer Analyzer1 5.9 11.564
pH Buffer Analyzer2 6.4 12.544
pO2 Buffer Analyzer1 48 94.08
pO2 Buffer Analyzer2 71 139.16
pH Buffer Analyzer1 6.4 12.544
pH Buffer Analyzer2 6.9 13.524
pO2 Buffer Analyzer1 51 99.96
pO2 Buffer Analyzer2 64 125.44

and use a pivot table with tabular layout and no subtotals

Test Type Name Average of Result StdDev of Result StdDev of ResultX1.96
pH Buffer Analyzer1 6.15 0.353553391 0.692964646
pH Buffer Analyzer2 6.65 0.353553391 0.692964646
pH Water Analyzer1 6.2 0.141421356 0.277185858
pH Water Analyzer2 6.65 0.212132034 0.415778787
pO2 Buffer Analyzer1 49.5 2.121320344 4.157787873
pO2 Buffer Analyzer2 67.5 4.949747468 9.701505038
pO2 Water Analyzer1 52.5 3.535533906 6.929646456
pO2 Water Analyzer2 66 2.828427125 5.543717165
Grand Total 32.64375 27.7691668 54.42756694
发布评论

评论列表(0)

  1. 暂无评论