I am trying to compute the Spearman rank correlation coefficient between two ranked lists, both with scipy.stats.spearmanr and manually with the formula:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

However, I am getting significantly different results.
I have two lists representing the top 10 similar users identified using two different methods:
Method 1 Rankings:
User ID: [301, 597, 414, 477, 57, 369, 206, 535, 590, 418]
Rank: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Method 2 Rankings:
User ID: [301, 477, 19, 120, 75, 57, 597, 160, 577, 369]
Rank: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
I then applied Spearman’s rank correlation using scipy.stats.spearmanr:

    from scipy.stats import spearmanr

    method1 = [301, 597, 414, 477, 57, 369, 206, 535, 590, 418]
    method2 = [301, 477, 19, 120, 75, 57, 597, 160, 577, 369]
    coef, _ = spearmanr(method1, method2)
    print(f"Spearman coefficient: {coef}")

Output:

    Spearman coefficient: 0.2727

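For reference, here is a minimal pure-Python sketch of what that call appears to be correlating. It assumes spearmanr ranks the numeric values inside each input list, so the user IDs themselves are treated as the data rather than as labels. The helper `value_ranks` is a hypothetical name introduced only for this illustration, and it assumes there are no ties.

```python
method1 = [301, 597, 414, 477, 57, 369, 206, 535, 590, 418]
method2 = [301, 477, 19, 120, 75, 57, 597, 160, 577, 369]


def value_ranks(values):
    """Rank each value by numeric size (1 = smallest); assumes no ties."""
    order = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [order[v] for v in values]


# Ranks of the ID numbers themselves, not of the positions in the list
r1 = value_ranks(method1)
r2 = value_ranks(method2)

n = len(method1)
d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(f"sum of d^2: {d_squared}, rho: {rho:.4f}")
# prints: sum of d^2: 120, rho: 0.2727
```

Under that assumption the sketch reproduces the 0.2727 reported above, which suggests the value describes how the ID numbers are ordered at each list position rather than how the two methods rank the same users.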
Manual Calculation: To verify, I computed ρ manually using the common users between both lists.
| User ID | Rank (Method 1) | Rank (Method 2) | $d_i = R_1 - R_2$ | $d_i^2$ |
|---|---|---|---|---|
| 301 | 1 | 1 | 0 | 0 |
| 597 | 2 | 7 | -5 | 25 |
| 477 | 4 | 2 | 2 | 4 |
| 57 | 5 | 6 | -1 | 1 |
| 369 | 6 | 10 | -4 | 16 |
We have only 5 common users, so n = 5.
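The manual calculation can be sketched in code as follows. This is only an illustration under the same assumptions as the table: each user's rank is the 1-based position in the original top-10 list, and only users present in both lists are kept. The names `rank1`, `rank2` and `common` are introduced here for illustration.

```python
method1 = [301, 597, 414, 477, 57, 369, 206, 535, 590, 418]
method2 = [301, 477, 19, 120, 75, 57, 597, 160, 577, 369]

# Rank = 1-based position of each user in its top-10 list
rank1 = {uid: pos for pos, uid in enumerate(method1, start=1)}
rank2 = {uid: pos for pos, uid in enumerate(method2, start=1)}

# Keep only the users that appear in both lists, in Method 1 order
common = [uid for uid in method1 if uid in rank2]
n = len(common)

# Sum of squared rank differences over the common users
d_squared = sum((rank1[u] - rank2[u]) ** 2 for u in common)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(f"common users: {common}")
print(f"n = {n}, sum of d^2 = {d_squared}, rho = {rho}")
```

One caveat about this setup: the shortcut formula assumes the ranks in each column are a permutation of 1..n with no ties. The ranks here are positions out of 10 while n is 5, so the result is not guaranteed to fall within [-1, 1].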
Using the formula I got Spearman coefficient(