Generate a summary table in Excel for multiple variables without PivotTable

I am trying to build a template in excel that summarizes a dataset according to test, sample type, and analyzer. Sorry - I've never posted a question about excel before, so please tell me if more info needed.

I need to calculate the Mean, Standard Deviation, and the Measurement Uncertainty for each test, separated by sample type and analyzer (MU = 1.96 x SD). My actual dataset has a few hundred 'Tests' two or three 'Types' for each, and two or four 'Analyzers'. There should be dozens of data points for each 'Result'. I will also make a few additional calcs, but keeping it simple for clarity.

Here is example data:

Test	Type	Name	Result
pH	Water	Analyzer1	6.1
pH	Water	Analyzer2	6.5
pO2	Water	Analyzer1	55
pO2	Water	Analyzer2	64
pH	Water	Analyzer1	6.3
pH	Water	Analyzer2	6.8
pO2	Water	Analyzer1	50
pO2	Water	Analyzer2	68
pH	Buffer	Analyzer1	5.9
pH	Buffer	Analyzer2	6.4
pO2	Buffer	Analyzer1	48
pO2	Buffer	Analyzer2	71
pH	Buffer	Analyzer1	6.4
pH	Buffer	Analyzer2	6.9
pO2	Buffer	Analyzer1	51
pO2	Buffer	Analyzer2	64

Here is example data:

Test	Type	Name	Result
pH	Water	Analyzer1	6.1
pH	Water	Analyzer2	6.5
pO2	Water	Analyzer1	55
pO2	Water	Analyzer2	64
pH	Water	Analyzer1	6.3
pH	Water	Analyzer2	6.8
pO2	Water	Analyzer1	50
pO2	Water	Analyzer2	68
pH	Buffer	Analyzer1	5.9
pH	Buffer	Analyzer2	6.4
pO2	Buffer	Analyzer1	48
pO2	Buffer	Analyzer2	71
pH	Buffer	Analyzer1	6.4
pH	Buffer	Analyzer2	6.9
pO2	Buffer	Analyzer1	51
pO2	Buffer	Analyzer2	64

The result I am after looks like this:

Test	Type	Name	Mean	SD	MU
pH	Water	Analyzer1	6.2	0.35	0.69
pH	Water	Analyzer2	6.7	0.35	0.69
pH	Buffer	Analyzer1	6.2	0.14	0.28
pH	Buffer	Analyzer2	6.7	0.21	0.42
pO2	Water	Analyzer1	52.5	3.54	6.93
pO2	Water	Analyzer2	66.0	2.83	5.54
pO2	Buffer	Analyzer1	49.5	2.12	4.16
pO2	Buffer	Analyzer2	67.5	4.95	9.70

I am trying to automate this as much as possible, but don't want to use VBA. Unless I am missing something, I can't use a PivotTable as I have to use the Mean/SD to calculate the MU. Thus far, I have tried creating a new table using:

=UNIQUE(FILTER(B:B,A:A="pH"))

which generates an array containing Buffer and Water, but I can't include AND() in the FILTER() function to further parse the data. I have tried a bit of XLOOKUP() but that doesn't seem to work either.

Very interested to hear an approach!

***Edits: I am using Microsoft® Excel® for Microsoft 365 MSO (Version 2408 Build 16.0.17928.20336) 32-bit Previous responses have employed GROUPBY() or PIVOTBY() but these functions do not appear present in this version of Excel.

Share Improve this question edited Mar 13 at 21:59 asked Mar 13 at 21:25 Dorton 3291 gold badge3 silver badges12 bronze badges

2 PIVOTBY is not available in my version of Excel. "Microsoft® Excel® for Microsoft 365 MSO (Version 2408 Build 16.0.17928.20336) 32-bit" - I have no choice on what version of Excel I have as this is installed by the anization – Dorton Commented Mar 13 at 21:40
1 GROUPBY also does not appear to be present... – Dorton Commented Mar 13 at 21:43
1 @Dorton please post those in the OP. So it helps others to know what version of excel you are using. – Mayukh Bhattacharya Commented Mar 13 at 21:44
1 @MayukhBhattacharya - that works like a charm! I will have to run through it to understand each step, but it does exactly what was asked. – Dorton Commented Mar 13 at 23:31
1 @Tom Sharpe - You solution below would be best as far as resource usage goes. I've crashed Excel a few times trying to apply the LET() function solutions. Unfortunately I have a few additional variables I need to calculate as well which means I can't only do that. – Dorton Commented Mar 14 at 15:40

| Show 5 more comments

3 Answers 3

Sorted by: Reset to default 4

Here is one way of accomplishing the desired output, using MAKEARRAY():

=LET(
     _Data, A1:D17,
     _Header, HSTACK(TAKE(_Data,1,3),"Mean","SD","MU"),
     _Body, DROP(_Data,1),
     _Catg, TAKE(_Body,,3),
     _Uniq, UNIQUE(_Catg),
     VSTACK(_Header, SORT(HSTACK(_Uniq, ROUND(MAKEARRAY(ROWS(_Uniq), 3, LAMBDA(r,c, 
     INDEX(LET(a, CHOOSECOLS(_Body,-1), 
     b, BYROW(_Catg, LAMBDA(x, AND(x=CHOOSEROWS(_Uniq, r)))), 
     d, FILTER(a,b), HSTACK(AVERAGE(d), STDEV(d), 1.96*STDEV(d))),c))),2)),{1,2})))

Or, Using GROUPBY()

=DROP(GROUPBY(A1:C17,D1:D17,HSTACK(AVERAGE,STDEV,LAMBDA(x,1.96*STDEV(x))),1,0),1)

In the absence of GROUPBY, another option is to use the concept of "thunks" (parameter-less lambda functions) to minimize redundant calculations and improve efficiency. For example:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   func,   HSTACK(LAMBDA(x, AVERAGE(x)), LAMBDA(x, STDEV.S(x))),
   result, MAP(IFNA(func, vals), IFNA(vals, func), LAMBDA(fnλ,val, fnλ(val()))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), HSTACK(DROP(keys,, 1), result, TAKE(result,, -1) * 1.96))
)

The vals variable returns a vertical vector of "thunks", LAMBDA(x), each one containing the filtered results for each unique set of keys. The func variable is defined as a horizontal vector of functions to be applied to each set of results. The elements of both vectors are TYPE 128 (compound data). IFNA is used to broadcast the func vector across each row of the vals vector, as well as the vals vector across each column of the func vector, then MAP loops through both arrays together, applying each function to each set of values. By doing it this way, the FILTER function is only evaluated ONCE per output row, the results of which are broadcast (repeated) across multiple columns and recalled as needed simply by providing the opening and closing brackets, e.g. val().

Alternatively, each function can also be mapped individually, and the results stacked together manually:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   mean,   MAP(vals, LAMBDA(x, AVERAGE(x()))),
   sdev,   MAP(vals, LAMBDA(x, STDEV.S(x()))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), HSTACK(DROP(keys,, 1), mean, sdev, sdev * 1.96))
)

Or REDUCE can be used to loop through the func vector and HSTACK the results automatically:

=LET(
   fields, A2:C17,
   values, D2:D17,
   key_id, BYROW(fields, LAMBDA(x, TEXTJOIN("|", 0, x))),
   keys,   UNIQUE(SORT(HSTACK(key_id, fields))),
   vals,   MAP(TAKE(keys,, 1), LAMBDA(id, LET(x, FILTER(values, key_id = id), LAMBDA(x)))),
   func,   HSTACK(LAMBDA(x, AVERAGE(x())), LAMBDA(x, STDEV.S(x())), LAMBDA(x, STDEV.S(x()) * 1.96)),
   result, REDUCE(DROP(keys,, 1), func, LAMBDA(a,fnλ, HSTACK(a, MAP(vals, fnλ)))),
   VSTACK(HSTACK(A1:C1, "Mean", "SD", "MU"), result)
)

Sample Results:

A minimalist solution, using the fact that

SD(x) * k = SD(x * k)

Add an extra column to the data

Test	Type	Name	Result	ResultX1.96
pH	Water	Analyzer1	6.1	11.956
pH	Water	Analyzer2	6.5	12.74
pO2	Water	Analyzer1	55	107.8
pO2	Water	Analyzer2	64	125.44
pH	Water	Analyzer1	6.3	12.348
pH	Water	Analyzer2	6.8	13.328
pO2	Water	Analyzer1	50	98
pO2	Water	Analyzer2	68	133.28
pH	Buffer	Analyzer1	5.9	11.564
pH	Buffer	Analyzer2	6.4	12.544
pO2	Buffer	Analyzer1	48	94.08
pO2	Buffer	Analyzer2	71	139.16
pH	Buffer	Analyzer1	6.4	12.544
pH	Buffer	Analyzer2	6.9	13.524
pO2	Buffer	Analyzer1	51	99.96
pO2	Buffer	Analyzer2	64	125.44

and use a pivot table with tabular layout and no subtotals

Test	Type	Name	Average of Result	StdDev of Result	StdDev of ResultX1.96
pH	Buffer	Analyzer1	6.15	0.353553391	0.692964646
pH	Buffer	Analyzer2	6.65	0.353553391	0.692964646
pH	Water	Analyzer1	6.2	0.141421356	0.277185858
pH	Water	Analyzer2	6.65	0.212132034	0.415778787
pO2	Buffer	Analyzer1	49.5	2.121320344	4.157787873
pO2	Buffer	Analyzer2	67.5	4.949747468	9.701505038
pO2	Water	Analyzer1	52.5	3.535533906	6.929646456
pO2	Water	Analyzer2	66	2.828427125	5.543717165
Grand Total			32.64375	27.7691668	54.42756694

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

Generate a summary table in Excel for multiple variables without PivotTable - Stack Overflow

3 Answers 3

与本文相关的文章

评论列表(0)