Machine learning - principal component analysis (PCA)
2022-06-24 10:10:00 【Cpsu】
# Start by creating a random data set with a dependency between its two features
import numpy as np
from matplotlib import pyplot as plt
from numpy import linalg
np.random.seed(2)
# Construct the data set: x2 grows roughly linearly with x1, plus noise
x1=[i for i in np.arange(1,10,0.1)]
x2=[np.random.uniform(2,4)*i+np.random.randn() for i in x1]
plt.scatter(x1,x2)
# np.zeros creates an array of the given shape filled with zeros
# Transform the data set into matrix form, one feature per column
x=np.zeros((90,2))
x[:,0]=np.array(x1)
x[:,1]=np.array(x2)
x.shape
#(90, 2)
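As a quick sanity check (not in the original post), the correlation matrix of the two features can be inspected; the off-diagonal entries should be strongly positive, which is exactly the linear structure PCA will pick up.

# Optional check: a strong positive off-diagonal correlation confirms the dependency
np.corrcoef(x1, x2)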

The first step is to center the data: subtract each feature's mean so that every column has zero mean.
# The axis parameter selects whether the mean is computed along rows or columns; axis=0 takes the mean of each column (each feature)
data_array=x
mean_array=np.mean(data_array,axis=0)
center_array=data_array-mean_array
# Or, equivalently, use np.subtract
center_array=np.subtract(data_array,np.mean(data_array,axis=0) )
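A small check, added here for illustration, that centering worked: the column means of center_array should be zero up to floating point error.

# After centering, the column means should be numerically zero
np.allclose(center_array.mean(axis=0), 0)
# Expected: True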
The second step is to compute the covariance matrix and its eigenvalues and eigenvectors.
# The rowvar parameter selects whether rows or columns are treated as variables; rowvar=False means each row is a sample and each column is a variable
cov_array=np.cov(center_array,rowvar=False)
eig_vals, eig_vects = linalg.eig(cov_array)
""" # The eigenvalue (array([ 1.23589914, 80.95385223]), # Eigenvector array([[-0.96430755, -0.26478471], [ 0.26478471, -0.96430755]])) Wherein, characteristic value 1.23589914 The corresponding eigenvector is array([-0.96430755,0.26478471]) """
# In practice, only the top K eigenvalues and their eigenvectors are kept as the principal components
# To keep the algorithm easy to follow, all eigenvalues are kept here
# Get the indices that would sort the eigenvalues (ascending)
val_index=np.argsort(eig_vals)
# Reverse to descending order, largest eigenvalue first
val_index=val_index[::-1]
# Reorder the eigenvectors accordingly
eig_vect=eig_vects[:,val_index]
# Project the centered data and keep the first principal component
np.dot(center_array, eig_vect)[:,0]
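To see how much information the first component keeps, the explained-variance ratio can be read off the eigenvalues. A short sketch, assuming the variables defined above:

# Each eigenvalue divided by the total gives the proportion of variance
# explained by the corresponding component; the first one dominates here
sorted_vals = eig_vals[val_index]
sorted_vals / sorted_vals.sum()
# Roughly array([0.985, 0.015]) given the eigenvalues above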

Finally, call the sklearn module to verify the result.
from sklearn.decomposition import PCA
data_mat = x
pca = PCA(n_components=1)
pca.fit(data_mat)
# fit was already called above, so a plain transform is sufficient
x_p=pca.transform(data_mat)
x_p
# The results are consistent with the manual computation above
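To make the comparison explicit, the manual projection and sklearn's output can be checked numerically; note that an eigenvector is only defined up to sign, so sklearn's component may be the negative of the manual one. A minimal sketch assuming the variable names used above:

# Compare manual PCA with sklearn; absolute values are used because
# the sign of a principal component is arbitrary
manual_pc1 = np.dot(center_array, eig_vect)[:, 0]
np.allclose(np.abs(manual_pc1), np.abs(x_p[:, 0]))
# Expected: True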