Machine learning - principal component analysis (PCA)
2022-06-24 10:10:00 【Cpsu】
# Start by creating a random dataset whose two features are correlated
import numpy as np
from matplotlib import pyplot as plt
from numpy import linalg
np.random.seed(2)
# Construct data set
x1=[i for i in np.arange(1,10,0.1)]
x2=[np.random.uniform(2,4)*i+np.random.randn() for i in x1]
plt.scatter(x1,x2)
# np.zeros creates an array of the given shape filled with 0 (it is np.empty that leaves the contents uninitialized)
# Put the dataset into matrix form: one row per sample, one column per feature
x=np.zeros((90,2))
x[:,0]=np.array(x1)
x[:,1]=np.array(x2)
x.shape
#(90, 2)
Step 1: center the data
# axis selects the direction of the mean: axis=0 averages down each column (one mean per feature), axis=1 averages along each row
data_array=x
mean_array=np.mean(data_array,axis=0)
center_array=data_array-mean_array
# Equivalently, use np.subtract
center_array=np.subtract(data_array,np.mean(data_array,axis=0) )
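The centering step can be checked on a tiny example. The sketch below uses a hypothetical 4x2 array (`toy` is not part of the original dataset) and confirms that, after subtracting the column means, every column averages to zero.

```python
import numpy as np

# Hypothetical toy data: 4 samples (rows), 2 features (columns)
toy = np.array([[1.0, 2.0],
                [2.0, 4.5],
                [3.0, 6.5],
                [4.0, 9.0]])

col_mean = np.mean(toy, axis=0)   # one mean per column (feature)
centered = toy - col_mean         # broadcasting subtracts the mean from every row

print(col_mean)                   # per-feature means
print(centered.mean(axis=0))      # numerically zero after centering
```

Broadcasting makes `toy - col_mean` and `np.subtract(toy, col_mean)` interchangeable here.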
Step 2: compute the covariance matrix, then its eigenvalues and eigenvectors
# rowvar selects the layout: rowvar=True treats each row as a variable, rowvar=False treats each column as a variable (so each row is a sample)
cov_array=np.cov(center_array,rowvar=False)
eig_vals, eig_vects = linalg.eig(cov_array)
"""
# Eigenvalues
array([ 1.23589914, 80.95385223])
# Eigenvectors (one per column)
array([[-0.96430755, -0.26478471],
       [ 0.26478471, -0.96430755]])
The eigenvalue 1.23589914 corresponds to the eigenvector array([-0.96430755, 0.26478471]), i.e. the first column
"""
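To make the covariance step less of a black box, here is a minimal sketch on hypothetical random data (`samples` is an assumption, not the article's dataset). It computes the covariance matrix by hand from the centered data, compares it with `np.cov`, and checks that each eigenvector column satisfies C v = λ v.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 2))            # hypothetical data: 50 samples, 2 features
centered = samples - samples.mean(axis=0)

# Sample covariance by hand: X^T X / (n - 1), with X already centered
cov_manual = centered.T @ centered / (centered.shape[0] - 1)
cov_numpy = np.cov(centered, rowvar=False)    # columns are variables

vals, vecs = np.linalg.eig(cov_numpy)

# Column j of vecs pairs with vals[j]; the residual of C v = lambda v should vanish
residual = cov_numpy @ vecs - vecs * vals

print(np.allclose(cov_manual, cov_numpy))
print(np.abs(residual).max())
```

Because a covariance matrix is symmetric, `np.linalg.eigh` would also work and returns the eigenvalues already sorted in ascending order.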
# Normally only the eigenvectors of the K largest eigenvalues are kept as principal components
# To make the algorithm easier to follow, all eigenvalues are kept here
# Get the indices that sort the eigenvalues in ascending order
val_index=np.argsort(eig_vals)
# Reverse them so the largest eigenvalue comes first
val_index=val_index[::-1]
# Reorder the eigenvector columns to match
eig_vect=eig_vects[:,val_index]
# Project the centered data onto the eigenvectors and take the first principal component
np.dot(center_array, eig_vect)[:,0]
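One common way to pick K is by the share of variance each eigenvalue explains. The sketch below reuses the two eigenvalues reported in the output above; the 0.95 `threshold` is a hypothetical choice, not something the original post specifies.

```python
import numpy as np

# Eigenvalues as reported in the output above
eig_vals = np.array([1.23589914, 80.95385223])

order = np.argsort(eig_vals)[::-1]            # indices from largest to smallest
ratio = eig_vals[order] / eig_vals.sum()      # share of total variance per component
cumulative = np.cumsum(ratio)

threshold = 0.95                              # hypothetical target for explained variance
k = int(np.searchsorted(cumulative, threshold) + 1)

print(ratio)        # the first component carries most of the variance
print(k)            # number of components needed to reach the threshold
```

With these eigenvalues the first component alone explains more than 98% of the variance, which is why a single component is kept in the sklearn check.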
Verify the result with sklearn
from sklearn.decomposition import PCA
data_mat = x
pca = PCA(n_components=1)
pca.fit(data_mat)
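# pca was already fitted above, so transform alone is enough
x_p=pca.transform(data_mat)
x_p
# The result matches the manual projection (possibly with the opposite sign, since an eigenvector is only defined up to sign)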
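When comparing two PCA implementations, the comparison has to ignore the sign of each component. The sketch below shows this with NumPy only, on hypothetical correlated data (`data` is an assumption): it projects once through the eigen-decomposition used in this post and once through an SVD of the centered data, which is the route sklearn's PCA is documented to take, and then compares the two up to sign.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical correlated 2-D data: 30 samples
data = rng.normal(size=(30, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
centered = data - data.mean(axis=0)

# Route 1: eigen-decomposition of the covariance matrix
vals, vecs = np.linalg.eig(np.cov(centered, rowvar=False))
first_vec = vecs[:, np.argmax(vals)]          # eigenvector of the largest eigenvalue
pc1_eig = centered @ first_vec

# Route 2: SVD of the centered data; rows of vt are the principal directions
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pc1_svd = centered @ vt[0]

# The two projections agree up to an overall sign
same_up_to_sign = np.allclose(pc1_eig, pc1_svd) or np.allclose(pc1_eig, -pc1_svd)
print(same_up_to_sign)
```

The same sign-insensitive test can be applied to the manual result and `x_p` from sklearn, after flattening `x_p` to one dimension.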