MATLAB Tips (19): Matrix Analysis -- Principal Component Analysis

2022-06-27 08:23:00 mozun2020

Preface

MATLAB is a friendly environment for learning image processing. You can start from scratch, and many built-in functions can be called directly for basic image processing. This series of articles mainly introduces, through routine demonstrations, some of the concepts and functions commonly used in MATLAB.

Principal component analysis (PCA) is a statistical method. It uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables, which are called the principal components. In practical problems, many related variables (or factors) are often introduced in order to analyze the problem comprehensively, because each variable reflects some information about the problem to a different degree. PCA was first introduced by Karl Pearson for non-random variables, and H. Hotelling later extended the method to random vectors. The amount of information is usually measured by the sum of squared deviations or by the variance.

The main steps of principal component analysis are as follows:

  1. Standardize the indicator data (SPSS does this automatically);

  2. Determine the correlation between the indicators;

  3. Determine the number of principal components m;

  4. Write the expression for each principal component Fi;

  5. Name each principal component Fi.
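Steps 1 to 4 above can be sketched in plain MATLAB as follows. This is only a minimal sketch: the data matrix X is made up for illustration, the 85% cumulative contribution threshold used to pick m is an assumption (the article instead asks the user for the number of components), and eig is used in place of the toolbox routine that appears later in the article.

```matlab
% Hypothetical data: 5 samples (rows) x 3 indicators (columns)
X = [2.5 2.4 1.2;
     0.5 0.7 0.3;
     2.2 2.9 1.1;
     1.9 2.2 0.9;
     3.1 3.0 1.6];
nObs = size(X, 1);

% Step 1: standardize each column (zero mean, unit sample standard deviation)
mu    = mean(X, 1);
sigma = std(X, 0, 1);
Z = (X - repmat(mu, nObs, 1)) ./ repmat(sigma, nObs, 1);

% Step 2: correlation matrix of the indicators (symmetrized against round-off)
R = corrcoef(Z);
R = (R + R') / 2;

% Step 3: eigen-decomposition, sorted by decreasing eigenvalue
[V, D] = eig(R);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);
contrib    = 100 * lambda / sum(lambda);   % contribution rate of each component (%)
cumContrib = cumsum(contrib);              % cumulative contribution rate (%)
m = find(cumContrib >= 85, 1, 'first');    % assumed threshold: 85%

% Step 4: principal component scores, F_i = Z * v_i for i = 1..m
F = Z * V(:, 1:m);

disp([contrib, cumContrib]);
```

Step 5 (naming the components) is an interpretation step: it is usually done by reading which indicators carry the largest coefficients in each column of V, and it has no code counterpart.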

Principal component analysis is a statistical method for dimension reduction. By means of an orthogonal transformation, it converts an original random vector whose components are correlated into a new random vector whose components are uncorrelated. Algebraically, this means transforming the covariance matrix of the original random vector into a diagonal matrix. Geometrically, it means rotating the original coordinate system into a new orthogonal coordinate system whose axes point along the p orthogonal directions in which the sample points are most spread out. The multidimensional variable system is then reduced in dimension, so that it can be represented by a low-dimensional variable system with high accuracy. Finally, by constructing an appropriate value function, the low-dimensional system can be further reduced to a one-dimensional system.
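The algebraic statement above can be written out as a short derivation. The symbols U, Λ and μ are notation introduced here for illustration and do not appear in the original article.

```latex
\begin{aligned}
&\text{Let } X \in \mathbb{R}^{p} \text{ have mean } \mu \text{ and covariance matrix } \Sigma = \operatorname{Cov}(X). \\
&\Sigma = U \Lambda U^{\top}, \qquad U^{\top} U = I, \qquad
  \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \qquad
  \lambda_1 \ge \dots \ge \lambda_p \ge 0. \\
&F = U^{\top} (X - \mu) \;\Longrightarrow\;
  \operatorname{Cov}(F) = U^{\top} \Sigma \, U = \Lambda. \\
&\text{Contribution rate of } F_i:\ \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j},
  \qquad
  \text{cumulative contribution rate of } F_1, \dots, F_m:\
  \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{j=1}^{p} \lambda_j}.
\end{aligned}
```

Because Λ is diagonal, the components of F are pairwise uncorrelated, and the variance of F_i is λ_i. When the data are standardized first, Σ is the correlation matrix and the eigenvalues sum to p.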

I came across this example while searching for material and am sharing it here. The MATLAB version used is MATLAB 2015b.

One. MATLAB Simulation

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Function: matrix analysis -- principal component analysis
% Environment: Win7, MATLAB 2015b
% Modi: C.S
% Date: 2022-06-25
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%  Clear environment variables 
clear all
clc

tic

A=[1	1/5	 1	1/5	 1/6   1/6
5	1	 5	1/2	 1/2   1/2
2	1/5	 1  1/4	 1/6   1/4
5	1/2	 4	1	 1/2   1/2
6	2	 6	2	 1	   1
6	2	 4	2	 1	   1];

h=zscore(A); % Standardize the data (zero mean, unit variance per column)
r=corrcoef(h); % Correlation coefficient matrix
disp('The calculated correlation coefficient matrix is as follows:');
disp(r);
[x,y,z]=pcacov(r);  % x: component coefficients, y: eigenvalues, z: contribution rates (%)
s=cumsum(z); % Cumulative contribution rates (%)
disp('Contribution rates (%) and cumulative contribution rates (%) of the principal components:');
disp([z,s])
tg=[z,s];
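f=repmat(sign(sum(x)),size(x,1),1); % Sign of each column sum, repeated down the rows
x=x.*f; % Flip columns so that every coefficient vector has a non-negative sum
n=input('Please select the number n of leading principal components to use:\n');
disp('The selected principal component coefficients are:');
xs=x(:,1:n)'; % Row i holds the coefficients of the i-th component (not printed; use disp(xs))
newdt=h*xs'; % Scores of the samples on the first n principal components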
disp('Using the contribution rates of the principal components as weights, the coefficients of the comprehensive evaluation model are:');
q=((z(1:n)./100)')
% Read the answer as text. Without the 's' flag, input evaluates the reply as an
% expression, so typing y would return the eigenvalue vector y and pass only by accident.
w=input('Is a comprehensive evaluation of the principal components required? (y or n)\n','s');
if strcmpi(w,'y')
    df=h*x(:,1:n); % Principal component scores
    tf=df*z(1:n)/100; % Composite score of each sample, weighted by contribution rate
    [stf,ind]=sort(tf,'descend'); % Sort in descending order
    disp('Ranking of the comprehensive evaluation results:');
    px=[ind,stf]
else
    return;
end
toc
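As a small check of what the three outputs of pcacov mean, the sketch below runs it on a made-up 3-by-3 correlation matrix. The matrix R and the choice n = 2 are assumptions for illustration only. pcacov requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical correlation matrix, used only for illustration
R = [1.0 0.8 0.6;
     0.8 1.0 0.5;
     0.6 0.5 1.0];

% coeff: component coefficients (one column per component)
% latent: eigenvalues of R, in descending order
% explained: percentage of total variance carried by each component
[coeff, latent, explained] = pcacov(R);

% The contribution rate is the eigenvalue divided by the sum of all eigenvalues
check = 100 * latent / sum(latent);
cumExplained = cumsum(explained);   % cumulative contribution rate (%)
disp([latent, explained, check, cumExplained]);

% Weights for a composite score built from the first n components
n = 2;                              % assumed number of retained components
weights = explained(1:n) / 100;
disp(weights');
```

This is why the listing above divides the third output by 100 before using it as a weight: that output is already a percentage, not an eigenvalue.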

Two. Simulation results

The calculated correlation coefficient matrix is as follows:
    1.0000    0.8374    0.9252    0.8344    0.9009    0.8941
    0.8374    1.0000    0.7750    0.9281    0.9766    0.9771
    0.9252    0.7750    1.0000    0.7110    0.8141    0.7946
    0.8344    0.9281    0.7110    1.0000    0.9749    0.9771
    0.9009    0.9766    0.8141    0.9749    1.0000    0.9967
    0.8941    0.9771    0.7946    0.9771    0.9967    1.0000

Contribution rates (%) and cumulative contribution rates (%) of the principal components:
   90.7979   90.7979
    7.0303   97.8282
    1.4832   99.3114
    0.6456   99.9569
    0.0431  100.0000
    0.0000  100.0000

Please select the number n of leading principal components to use:
4
The selected principal component coefficients are:
Using the contribution rates of the principal components as weights, the coefficients of the comprehensive evaluation model are:

q =

    0.9080    0.0703    0.0148    0.0065

Is a comprehensive evaluation of the principal components required? (y or n)
y
Ranking of the comprehensive evaluation results:

px =

    5.0000    2.5471
    6.0000    2.1629
    2.0000    0.1177
    4.0000   -0.0866
    3.0000   -2.2277
    1.0000   -2.5134

Elapsed time is 13.348336 seconds.

Three. Summary

This is an example of principal component analysis applied to a matrix. I remember learning this before. As I understand it, PCA can be applied when processing feature parameters, and its main use is dimensionality reduction. Redundant information interferes with prediction and recognition, so it needs to be filtered out. Reducing the dimension with PCA and keeping only the information carried by the principal components can give better classification and prediction results. When hardware resources are limited, dimensionality reduction can also improve system response time without sacrificing the required performance, which is another application scenario for PCA. MATLAB in fact already ships with functions for principal component analysis, so this example amounts to a rewrite of them, recorded here as a note. Learn a little MATLAB every day, and let's learn and make progress together!
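For comparison, here is a minimal sketch of the same analysis with the built-in pca function from the Statistics and Machine Learning Toolbox, using the matrix A from the listing above. The choice n = 4 mirrors the run shown in the results and is otherwise arbitrary. The signs of the components returned by pca may differ from the sign convention used in the listing, so the composite scores and the ranking are not guaranteed to match it.

```matlab
% Data matrix from the article: 6 samples (rows) x 6 indicators (columns)
A = [1 1/5 1 1/5 1/6 1/6;
     5 1   5 1/2 1/2 1/2;
     2 1/5 1 1/4 1/6 1/4;
     5 1/2 4 1   1/2 1/2;
     6 2   6 2   1   1;
     6 2   4 2   1   1];

% pca centers the data itself; standardizing first makes it work on the
% correlation structure, as the listing does with zscore and corrcoef
[coeff, score, latent, ~, explained] = pca(zscore(A));

disp([explained, cumsum(explained)]);   % contribution and cumulative rates (%)

n = 4;                                  % number of retained components (assumed)
tf = score(:, 1:n) * explained(1:n) / 100;   % composite score of each sample
[stf, ind] = sort(tf, 'descend');       % rank the samples by composite score
disp([ind, stf]);
```

Here score already contains the projected samples, so the separate multiplication of the standardized data by the coefficient matrix is not needed.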

Copyright notice
This article was written by [mozun2020]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/178/202206270813000913.html