
pandas - How to calculate Correlation_matrix - Stack Overflow


I want to calculate the correlation matrix between two lists of values, X and Y. Each has 12 values, so the matrix should be 24x24, because I also need the correlations between the two lists. However, the result shows either NaNs or a matrix of all ones!

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

X = [2.3081292633385466, 236.1480117092457, 141.99869221411194, 37.124935350627766, 882.9003651182456, 10.615305656934307, 182.3427324361314, 47.45349908759125, 9.976779197080292, 118.0901003649635, 169.4717877819549, 19.535703695255474]
Y = [3.021000887481155, 240.7961033050437, 146.62088766156856, 37.34449838919769, 904.7717597164876, 10.758901188220385, 200.6302524676273, 50.51514528015716, 10.141817514463913, 118.30636956822278, 179.6051491778388, 20.591913658817106]

df = pd.DataFrame({'X': X, 'Y': Y})

correlation_matrix = pd.DataFrame(index=range(24), columns=range(24))

for i in range(12):
    for j in range(12):
        correlation_matrix.iloc[i, j] = pd.Series([X[i], X[j]]).corr(pd.Series([X[j], X[i]]))
        correlation_matrix.iloc[i, j+12] = pd.Series([X[i], Y[j]]).corr(pd.Series([Y[j], X[i]]))
        correlation_matrix.iloc[i+12, j] = pd.Series([Y[i], X[j]]).corr(pd.Series([X[j], Y[i]]))
        correlation_matrix.iloc[i+12, j+12] = pd.Series([Y[i], Y[j]]).corr(pd.Series([Y[j], Y[i]]))

correlation_matrix = correlation_matrix.apply(pd.to_numeric)

print("Correlation matrix:")
print(correlation_matrix)

plt.figure(figsize=(12, 10))
sns.heatmap(correlation_matrix, annot=True, fmt=".2f", cmap='coolwarm', cbar=True)
plt.title('Correlation Matrix for X and Y')
plt.show()
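
A note on why the loop above degenerates, with a minimal sketch of one alternative. Each call in the loop correlates two Series of only two points each. With two points, the Pearson correlation can only be +1 or -1, or NaN when the two points are equal (zero variance), so no 24x24 matrix built this way can carry more information than that. The sketch below assumes that X and Y are two variables with 12 paired observations each. That is an interpretation, not something the question states. Under that assumption the correlation matrix is 2x2 and `DataFrame.corr()` computes it directly. The names `corr`, `pair` and `same` are illustrative only.

```python
import numpy as np
import pandas as pd

X = [2.3081292633385466, 236.1480117092457, 141.99869221411194,
     37.124935350627766, 882.9003651182456, 10.615305656934307,
     182.3427324361314, 47.45349908759125, 9.976779197080292,
     118.0901003649635, 169.4717877819549, 19.535703695255474]
Y = [3.021000887481155, 240.7961033050437, 146.62088766156856,
     37.34449838919769, 904.7717597164876, 10.758901188220385,
     200.6302524676273, 50.51514528015716, 10.141817514463913,
     118.30636956822278, 179.6051491778388, 20.591913658817106]

# Assumption: X and Y are two variables observed 12 times each,
# so every row of the frame is one paired observation.
df = pd.DataFrame({"X": X, "Y": Y})

# Pearson correlation between the columns: a 2x2 matrix.
corr = df.corr()
print(corr)

# Why the element-wise loop fails: a two-point Series has no freedom.
# Swapped points give a perfectly negative correlation...
pair = pd.Series([1.0, 2.0]).corr(pd.Series([2.0, 1.0]))

# ...and identical points have zero variance, so the result is NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    same = pd.Series([1.0, 1.0]).corr(pd.Series([1.0, 1.0]))
```

If a heatmap is still wanted, `corr` can be passed to `sns.heatmap` in place of `correlation_matrix`. A genuine 24x24 matrix would need each of the 24 entries to be a variable with several observations of its own, which the data in the question does not provide.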