
Displaying a correlation matrix as a heatmap

2022-06-25 00:43:00 Dream painter

The Pearson correlation coefficient is commonly used to quantify the relationship between two variables, that is, to measure the strength of their linear correlation.
The coefficient takes values in the range [-1, 1]:

  • -1 indicates a perfect negative linear correlation
  • 0 indicates no linear relationship
  • 1 indicates a perfect positive linear correlation

The farther the coefficient is from 0, the stronger the correlation. When there are more than two variables, a correlation matrix is usually used: a square matrix that holds the correlation coefficient for every pair of variables.
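To make the definition concrete, the coefficient can be computed directly as the covariance of the two variables divided by the product of their standard deviations. The snippet below is a minimal sketch of that calculation in plain Python; `pearson_r` is a hypothetical helper name chosen for illustration and is not part of pandas.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition.

    Hypothetical helper for illustration; assumes x and y have the same
    length and that neither is constant (otherwise the denominator is 0).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Perfectly linear toy data sits at the two ends of the range
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```

In practice pandas performs this calculation for every pair of columns at once, so a helper like this is only useful for checking individual values by hand.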

The Python example below shows how to calculate a correlation matrix and display it as a heatmap.

Create correlation matrix

First, create the sample data set:

import pandas as pd

data = {'assists': [4, 5, 5, 6, 7, 8, 8, 10],
        'rebounds': [12, 14, 13, 7, 8, 8, 9, 13],
        'points': [22, 24, 26, 26, 29, 32, 20, 14]
        }

df = pd.DataFrame(data, columns=['assists','rebounds','points'])
df

#    assists  rebounds  points
# 0        4        12      22
# 1        5        14      24
# 2        5        13      26
# 3        6         7      26
# 4        7         8      29
# 5        8         8      32
# 6        8         9      20
# 7       10        13      14

Next, calculate the correlation matrix:

# Create the correlation matrix (Pearson by default)
df.corr()

#            assists  rebounds    points
# assists   1.000000 -0.244861 -0.329573
# rebounds -0.244861  1.000000 -0.522092
# points   -0.329573 -0.522092  1.000000

# Create the correlation matrix, rounded to three decimal places
df.corr().round(3)

#           assists  rebounds  points
# assists     1.000    -0.245  -0.330
# rebounds   -0.245     1.000  -0.522
# points     -0.330    -0.522   1.000

The diagonal entries of the table are all 1, because every variable is perfectly correlated with itself. The other values are the correlation coefficients for each pair of variables:

The correlation coefficient between assists and rebounds is -0.245.
The correlation coefficient between assists and points is -0.330.
The correlation coefficient between rebounds and points is -0.522.
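Reading the pairs off the table by hand works for three variables but gets tedious for larger matrices. As a rough sketch, the unique pairs can be pulled from the upper triangle of the matrix and sorted by strength. The variable name `pairs` and the sort order are illustrative choices, not part of the original example.

```python
import pandas as pd

data = {'assists': [4, 5, 5, 6, 7, 8, 8, 10],
        'rebounds': [12, 14, 13, 7, 8, 8, 9, 13],
        'points': [22, 24, 26, 26, 29, 32, 20, 14]}
corr = pd.DataFrame(data).corr()

# Walk the upper triangle so each pair appears once and the diagonal is skipped
cols = list(corr.columns)
pairs = [
    (cols[i], cols[j], float(corr.iloc[i, j]))
    for i in range(len(cols))
    for j in range(i + 1, len(cols))
]

# Strongest relationship first: sort by distance from 0, ignoring the sign
pairs.sort(key=lambda p: abs(p[2]), reverse=True)

for a, b, r in pairs:
    print(f"{a} and {b}: {r:.3f}")
```

With this data set the strongest pair is rebounds and points, followed by assists and points, then assists and rebounds.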

Visualization of correlation matrix

Next, visualize the correlation matrix as a heatmap. The code below uses seaborn's heatmap function; the commented-out lines offer alternative color maps for different styles:

import seaborn as sns
import matplotlib.pyplot as plt

corr = df.corr()
# Draw the heatmap; swap in one of the commented-out lines to try another color map
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="Blues")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="RdYlGn")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="coolwarm")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="bwr")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="PuOr")
plt.title('Correlation heatmap')
plt.show()

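Because correlation coefficients always lie in [-1, 1], it can help to pin the color scale to that range and print the values inside the cells, so that colors stay comparable between data sets. The following is one possible variation, not the author's original plot; it assumes a seaborn version that supports the `annot`, `fmt`, `vmin`, `vmax`, `center`, and `square` parameters of `sns.heatmap`.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {'assists': [4, 5, 5, 6, 7, 8, 8, 10],
        'rebounds': [12, 14, 13, 7, 8, 8, 9, 13],
        'points': [22, 24, 26, 26, 29, 32, 20, 14]}
corr = pd.DataFrame(data).corr()

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr,
    annot=True,        # write each coefficient inside its cell
    fmt=".3f",         # three decimal places, matching the rounded table
    vmin=-1, vmax=1,   # fix the color scale to the full range of the coefficient
    center=0,          # map zero correlation to the midpoint of the color map
    cmap="coolwarm",   # diverging map: one hue for negative, another for positive
    square=True,       # draw square cells
    ax=ax,
)
ax.set_title("Correlation heatmap (annotated)")
fig.tight_layout()
plt.show()
```

A diverging color map such as "coolwarm" suits this setup because negative and positive correlations get visually distinct hues, whereas a sequential map such as "Blues" only conveys magnitude along one direction.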

Complete code


import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

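# Font settings for rendering Chinese labels and the minus sign correctly.
# They require the SimHei font to be installed and can be dropped for English-only labels.
plt.rcParams["font.sans-serif"] = ["SimHei"]
plt.rcParams['axes.unicode_minus'] = False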

data = {'assists': [4, 5, 5, 6, 7, 8, 8, 10],
        'rebounds': [12, 14, 13, 7, 8, 8, 9, 13],
        'points': [22, 24, 26, 26, 29, 32, 20, 14]
        }

df = pd.DataFrame(data, columns=['assists', 'rebounds', 'points'])

corr = df.corr()
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="Blues")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="RdYlGn")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="coolwarm")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="bwr")
# sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap="PuOr")
plt.title('Correlation heatmap')
plt.show()

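Original source

Copyright notice
This article was written by [Dream painter]. Please include a link to the original when reposting. Thank you.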
https://yzsam.com/2022/176/202206242002195949.html