StdDev and Var in different data sets

DID

New member
Hello everybody, Im new here, but has been following the forum for some time.

I really hope that today you can help me with the formula I need for Excel here is my setup:

I have several data sets (up to 600.000 rows in total). In column A, I have a number (e.g. 1) represents data set 1 (i.e. row #2 till row #3.000 all marked as "1" i Column A). Data set 2 i represented as "2" still in column A, but ranging from row #3001 till ca #6000. And so it continues up to ca. 600.000, where I have ca. 3500 data sets. In Column B I have the corresponding results from the different data sets.

Now, I would like to make a formula for determining the Standard deviation and Variance for the different data sets corresponding results.

In other words, what I would like to the program to do is: "For all the "1"'s in Column A, I want to calculate the Variance and Std.Dev., and for all the "2" in Column A..... etc"

I really hope that you understand what I mean, and that you are able to help me...

All the best,
DID

Simi

New member
I am unsure how this will work on a large data set, you may run out of system resources.
I usually try to do calculations like this with a macro for large data comparisons.

I have sample data:
 Column A Column B Column C Column D Column E Column F 1 1 stdev 1 1.58113883 1 2 2 1.870828693 1 3 3 1.290994449 1 4 1 5 var 1 2.5 2 0 2 3.5 2 1 3 1.666666667 2 2 2 3 2 4 2 5 3 2 3 3 3 4 3 5

The formulas used in F2 is =STDEV(IF(A:A=E2,B:B,"")) you need to enter this as an array formula by hitting ctrl+shift+enter it will appear as {=STDEV(IF(A:A=E2,B:B,""))} afterwords.
just replace the stdev with var for your other formula, =VAR(IF(A:A=E6,B:B,""))

Hope this helps.

Simi

DID

New member
WOW Simi... you are truly a saviour... Also nice to get these "buzzwords" like array formula, to do further search, thats very helpful

Simi

New member
I'm glad this helped, I love helping people on this site. I have gotten lots of help myself from this great community.

DID

New member
Yes that is absolutely very helpful.

Just one more thing:

It is easy in you example to write 1, 2 and 3 in Column E, but with that many data sets, how do I make it automatically read one number (digit) of Column A and place them in Column E. OR can one sort the Column A, so it shows 1,2,3,.... And then copy the column A? I hope you get what I mean...

Simi

New member
is your data in column A all whole numbers 1 2 3...9 10? Do you know how many data sets you have?

you could just do this, put 1 in cell E3, and then in E4 put =E3+1, then copy that down as many data sets you have.
The formulas in column F will work even if the data in column A is not sorted.

otherwise yes copy column A to column E, then sort Column E, and remove the duplicates.

Again you could make a macro to take all unique values from column A and put them in column E.
Do you know any programming?

DID

New member
Thanks again. Unfortunately my data are not numerical, the first set starts with 31233 and otherwise no logic (except for 5 digits). Programming, not really. I have used "R" in some cases to do some next generation sequencing data analysis, but from help of others...

I have figured it out myself by inserting following in a new colum "B" in B2 with a header in both Column A and B, if I have 10 results: =INDEX(\$A\$2:\$A\$10, MATCH(0, COUNTIF(\$B\$1:B1, \$A\$2:\$A\$10), 0))

So far this have given me what I want (for others to use if they find this thread)...

is your data in column A all whole numbers 1 2 3...9 10? Do you know how many data sets you have?

you could just do this, put 1 in cell E3, and then in E4 put =E3+1, then copy that down as many data sets you have.
The formulas in column F will work even if the data in column A is not sorted.

otherwise yes copy column A to column E, then sort Column E, and remove the duplicates.

Again you could make a macro to take all unique values from column A and put them in column E.
Do you know any programming?

Simi

New member
I am attaching a sample sheet, please note for this to work, column A needs to be sorted in ascending order.
Also I only added this for the STDEV section.
check the formulas in E3:F12.
Note E3 is different than E4.

If you have questions please ask, and I hope this helps.

Attachments

• Sample2-1-14-14.xlsx
10.9 KB · Views: 8

Simi

New member
Well I'm glad you have gotten this working, I posted the same time you did