
Numerical Calculation Methods, Chapter 7: Calculating Eigenvalues and Eigenvectors of Matrices

2022-06-23 06:40:00 Espresso Macchiato

0. Problem description

The problem in this chapter is: given an $n$-th order matrix $A$, how do we numerically solve for its eigenvalues, i.e. find $\lambda$ and $x$ such that:

$$Ax = \lambda x$$

1. Power method

1. Ideas

The main idea of the power method comes from iteration.

Obviously, for any vector $\vec{x}$, we can always expand it in a set of orthonormal basis vectors of the $n$-th order matrix $A$ (assuming such an eigenbasis exists, e.g. when $A$ is real symmetric), i.e.:

$$\vec{x} = \sum_{i=1}^{n} x_i \cdot \vec{n_i}$$

where $\vec{n_i}$ is a unit eigenvector of the matrix $A$, satisfying $A\vec{n_i} = \lambda_i \vec{n_i}$.

Let $\vec{x}^{(0)} = \sum_{i}x_i \vec{n_i}$ and $\vec{x}^{(k+1)} = A\vec{x}^{(k)}$. It follows easily that:

$$\vec{x}^{(k)} = \sum_{i} x_i \cdot \lambda_i^k \cdot \vec{n_i}$$

The corresponding norm is:

$$||\vec{x}^{(k)}|| = \sqrt{\sum_i x_i^2 \lambda_i^{2k}}$$

Now consider the direction of $\vec{x}^{(k)}$ as $k \to \infty$. Denote $\max_i(|\lambda_i|) = \lambda$.

Without loss of generality, order the eigenvalues so that $\lambda = |\lambda_1| \geq |\lambda_2| \geq ... \geq |\lambda_n|$. We can then discuss case by case:

  1. Suppose exactly one eigenvalue $\lambda_i$ attains $|\lambda_i| = \lambda$, and it is positive. Then:

    $$\lim_{k \to \infty} \frac{\vec{x}^{(k)}}{||\vec{x}^{(k)}||} = \vec{n_1}$$

    In this case:

    $$\vec{x}^{(k+1)} = A\vec{x}^{(k)} = \lambda \vec{x}^{(k)}$$

  2. Suppose $|\lambda_i| = \lambda$ holds for $m$ indices $i$, all of which satisfy $\lambda_i > 0$. Then:

    $$\lim_{k \to \infty} \frac{\vec{x}^{(k)}}{||\vec{x}^{(k)}||} = \sum_{i}^{m}\frac{x_i}{\sqrt{\sum_i^m x_i^2}}\vec{n_i}$$

    And likewise:

    $$\vec{x}^{(k+1)} = A\vec{x}^{(k)} = \lambda \vec{x}^{(k)}$$

  3. Suppose there are $m$ positive eigenvalues with $|\lambda_i| = \lambda$ and $l$ negative ones. Then:

    $$\lim_{k \to \infty} \frac{\vec{x}^{(k)}}{||\vec{x}^{(k)}||} = \sum_{i}^{m}\frac{x_i}{\sqrt{\sum_i^{m+l} x_i^2}}\vec{n_i} + \sum_{j}^{l}\frac{(-1)^{k} \cdot x_j}{\sqrt{\sum_i^{m+l} x_i^2}}\vec{n_j}$$

    Here, although $\vec{x}^{(k+1)} / \vec{x}^{(k)}$ does not stabilize, we still have:

    $$\vec{x}^{(k+2)} = AA\vec{x}^{(k)} = \lambda^2 \vec{x}^{(k)}$$

Note, however, that the above covers only the general case, which rests on the assumption that all $x_i \neq 0$. If the projection onto some component happens to be 0, i.e. some $x_i = 0$, then the discussion above changes or may even fail.

Moreover, as the analysis above shows, the power method can only obtain the eigenvalue of a general matrix with the largest absolute value $\lambda$; it cannot obtain all of the eigenvalues. This also needs attention.
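
As a small worked illustration (an example added here for concreteness, not part of the original derivation): take

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad \vec{x}^{(0)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

Then $\vec{x}^{(1)} = (2, 1)^T$, $\vec{x}^{(2)} = (5, 4)^T$, $\vec{x}^{(3)} = (14, 13)^T$: the direction converges to $(1, 1)^T / \sqrt{2}$, the unit eigenvector of the dominant eigenvalue $\lambda = 3$, and the componentwise ratios $x_i^{(k+1)} / x_i^{(k)}$ approach 3.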

2. Normalized iteration

Based on the above ideas, the standard normalized iteration of the power method is:

$$\left\{ \begin{aligned} \vec{y}^{(k)} &= \vec{x}^{(k)} / ||\vec{x}^{(k)}|| \\ \vec{x}^{(k+1)} &= A \cdot \vec{y}^{(k)} \end{aligned} \right.$$

Through the above iteration, we can find the eigenvalue of the matrix $A$ with the largest absolute value.

3. Pseudocode implementation

The Python pseudocode implementation is given as follows:

import math

def power_fn(A, epsilon=1e-6):
    """Power method: returns the eigenvalue of A with the largest absolute value."""
    n = len(A)
    x = [1.0 for _ in range(n)]  # initial vector; assumed to have nonzero projections
    lamda = 0.0
    for _ in range(10**3):
        # normalize the current iterate
        s = math.sqrt(sum(t*t for t in x))
        y = [t/s for t in x]
        # z = A @ y
        z = [0.0 for _ in range(n)]
        for i in range(n):
            for j in range(n):
                z[i] += A[i][j] * y[j]
        # estimate lambda as the average of the componentwise ratios z_i / y_i
        # (assumes no component of y is zero)
        lamda = sum(a/b for a, b in zip(z, y)) / n
        delta = max(abs(a-b) for a, b in zip(z, x))
        x = z
        if delta < epsilon:
            break
    return lamda
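
For example (a minimal usage sketch; the test matrix is chosen here for illustration):

A = [[2.0, 1.0],
     [1.0, 2.0]]   # eigenvalues: 3 and 1
print(power_fn(A)) # prints approximately 3.0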

Of course, only the case where the eigenvalue of maximum absolute value $\lambda$ has a single sign is considered here. If there are two largest eigenvalues with equal absolute value and opposite signs, the code above fails; handling that case requires some additional operations and checks, which we will not expand on here.

2. Inverse power method

1. Ideas & Method

The idea of the inverse power method is essentially the same as that of the power method, but while the power method iterates directly forward, i.e.:

$$\vec{x}^{(k+1)} = A \vec{x}^{(k)}$$

the iteration of the inverse power method is:

$$\vec{x}^{(k+1)} = A^{-1} \vec{x}^{(k)}$$

or equivalently:

$$\vec{x}^{(k)} = A \vec{x}^{(k+1)}$$

For the per-step solution, we only need to add the method for solving systems of linear equations from the previous chapter on top of the power method above.

It should be added that, because the iteration used here is the reverse of the power method, what is obtained is the eigenvalue of $A^{-1}$ with the largest absolute value, i.e. the eigenvalue of $A$ with the smallest absolute value. For example, if $A$ has eigenvalues $\{3, 1\}$, then $A^{-1}$ has eigenvalues $\{1/3, 1\}$, so the iteration converges to the eigenvector associated with $\lambda = 1$.

Likewise, there is an additional implicit condition: the matrix $A$ must be of full rank. Otherwise the inverse matrix does not exist, and the equation $\vec{x}^{(k)} = A \vec{x}^{(k+1)}$ above may have no solution.

2. Pseudocode implementation

Likewise, for the simplest case (namely, $A$ is full rank and has a single eigenvalue of minimum absolute value), the Python pseudocode implementation of the inverse power method is:

import math

def rev_power_fn(A, epsilon=1e-6):
    """Inverse power method: returns the eigenvalue of A with the smallest absolute value."""
    n = len(A)
    x = [1.0 for _ in range(n)]
    lamda = 0.0
    for _ in range(10**3):
        # normalize the current iterate
        s = math.sqrt(sum(t*t for t in x))
        y = [t/s for t in x]
        # solve A z = y, i.e. z = A^{-1} y, using the linear solver from the previous chapter
        z = loose_iter(A, y)
        # since z ~ y / lambda, the componentwise ratios y_i / z_i estimate lambda
        lamda = sum(a/b for a, b in zip(y, z)) / n
        delta = max(abs(a-b) for a, b in zip(z, x))
        x = z
        if delta < epsilon:
            break
    return lamda
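
Note that loose_iter above refers to the linear-system solver from the previous chapter and is not defined in this article; it is assumed to solve $Az = y$ for $z$. A minimal Gauss-Seidel-style stand-in (a sketch, assuming e.g. a strictly diagonally dominant $A$ with nonzero diagonal) might look like:

def loose_iter(A, y, epsilon=1e-9):
    # Gauss-Seidel iteration for A z = y; convergence is assumed
    # (e.g. A strictly diagonally dominant), diagonal entries nonzero.
    n = len(A)
    z = [0.0] * n
    for _ in range(10**4):
        delta = 0.0
        for i in range(n):
            r = y[i] - sum(A[i][j] * z[j] for j in range(n) if j != i)
            new = r / A[i][i]
            delta = max(delta, abs(new - z[i]))
            z[i] = new
        if delta < epsilon:
            break
    return z

With this in place, rev_power_fn([[2.0, 1.0], [1.0, 3.0]]) should return roughly 1.38, i.e. the eigenvalue of smallest absolute value, $(5 - \sqrt{5})/2$.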

3. The Jacobi method for real symmetric matrices

1. Ideas & Method

As mentioned earlier, the power method and the inverse power method essentially use iteration to find a stable eigenvector, and then compute the eigenvalue from that eigenvector.

Therefore, they can only find a single eigenvalue of the matrix, and cannot solve for all of its eigenvalues. If you want all the eigenvalues of a matrix, the above methods fail.

However, for a special class of matrices, namely real symmetric matrices, we can in fact solve for all the eigenvalues. A typical method for this is the Jacobi method.

In essence, the Jacobi method still iterates, but the idea of the iteration is to repeatedly apply orthogonal similarity transformations to the matrix so that it converges to a diagonal matrix; the diagonal elements of that diagonal matrix are then the eigenvalues of the original matrix.

Specifically, given $Ax = \lambda x$ and an orthogonal matrix $U$ satisfying $UU^{T} = I$, we have:

$$UAU^{T} \cdot Ux = \lambda \cdot Ux$$

Here, $\lambda$ is still an eigenvalue of the matrix $UAU^{T}$.

If, after a series of such transformations:

$$U_kU_{k-1}...U_{1}AU_1^{T}U_2^{T}...U_{k}^T = diag(\lambda_1, ..., \lambda_n)$$

then $\lambda_1, ..., \lambda_n$ are exactly all the eigenvalues of the matrix $A$.

The remaining question is how to construct these matrices $U$. A feasible idea given by the Jacobi method is the Givens matrix, i.e.:

$$G(p, q, \theta) = \begin{pmatrix} 1 \\ & \ddots \\ & & \cos\theta & \cdots & -\sin\theta \\ & & \vdots & & \vdots \\ & & \sin\theta & \cdots & \cos\theta \\ & & & & & \ddots \\ & & & & & & 1 \end{pmatrix}$$

In other words, the transformation only rotates the components in rows (and columns) $p, q$ by an angle $\theta$.

Let:

$$B = G(p,q,\theta) \cdot A \cdot G^{T}(p,q,\theta)$$

Then:

$$\left\{ \begin{aligned} b_{i,j} &= a_{i,j} \\ b_{i,p} &= b_{p,i} = a_{p,i}\cos\theta - a_{q,i}\sin\theta \\ b_{i,q} &= b_{q,i} = a_{p,i}\sin\theta + a_{q,i}\cos\theta \\ b_{p,p} &= a_{p,p}\cos^{2}\theta + a_{q,q}\sin^{2}\theta - a_{p,q}\sin2\theta \\ b_{q,q} &= a_{p,p}\sin^{2}\theta + a_{q,q}\cos^{2}\theta + a_{p,q}\sin2\theta \\ b_{p,q} &= b_{q,p} = a_{p,q} \cos2\theta + \frac{a_{p,p} - a_{q,q}}{2}\sin2\theta \end{aligned} \right.$$

where $i, j \neq p, q$ in the first three equations.

Now take the appropriate $\theta$ such that $b_{q,p} = 0$, i.e. $\tan 2\theta = \frac{2a_{p,q}}{a_{q,q} - a_{p,p}}$ (taking $\theta = \frac{\pi}{4}$ when $a_{p,p} = a_{q,q}$). Then:

$$\left\{ \begin{aligned} \sum_{i\neq j} b_{i,j}^2 &= \sum_{i\neq j} a_{i,j}^2 - 2a_{p,q}^2 \\ \sum_{i} b_{i,i}^2 &= \sum_{i} a_{i,i}^2 + 2a_{p,q}^2 \end{aligned} \right.$$

We can see that the sum of squares of the off-diagonal elements strictly decreases with each rotation (the Frobenius norm is invariant under orthogonal similarity, so the eliminated mass moves onto the diagonal).

Therefore, after enough iterations, the original matrix $A$ can be transformed into a nearly diagonal matrix with the same eigenvalues.

To further speed up the iteration, the off-diagonal element with the largest absolute value can be selected first for elimination.
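
As a quick sanity check (a worked example added here): for

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

we have $a_{p,p} = a_{q,q}$, so $\theta = \frac{\pi}{4}$, and a single rotation yields $b_{p,p} = 2 \cdot \frac{1}{2} + 2 \cdot \frac{1}{2} - 1 = 1$, $b_{q,q} = 2 \cdot \frac{1}{2} + 2 \cdot \frac{1}{2} + 1 = 3$, and $b_{p,q} = 0$: the matrix is diagonalized in one step, with eigenvalues 1 and 3.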

2. Pseudocode implementation

Likewise, the Python pseudocode implementation is given below. For simplicity, it sweeps over all $(p, q)$ pairs cyclically rather than searching for the largest off-diagonal element each time:

import math
from copy import deepcopy

def jacobi_eigen_fn(A, epsilon=1e-6):
    """Jacobi method: returns all eigenvalues of a real symmetric matrix A (modified in place)."""
    n = len(A)
    assert len(A[0]) == n
    assert all(A[i][j] == A[j][i] for i in range(n) for j in range(i+1, n))

    finished = False
    for _ in range(1000):
        for p in range(n):
            for q in range(p+1, n):
                # choose theta so that the rotated element B[p][q] becomes 0
                if A[p][p] != A[q][q]:
                    theta = 0.5 * math.atan(2*A[p][q] / (A[q][q] - A[p][p]))
                else:
                    theta = math.pi / 4
                ap = deepcopy(A[p])
                aq = deepcopy(A[q])
                # update rows/columns p and q (all other elements stay unchanged)
                for i in range(n):
                    if i == p or i == q:
                        continue
                    A[i][p] = A[p][i] = ap[i] * math.cos(theta) - aq[i] * math.sin(theta)
                    A[i][q] = A[q][i] = ap[i] * math.sin(theta) + aq[i] * math.cos(theta)
                A[p][p] = ap[p] * math.cos(theta)**2 + aq[q] * math.sin(theta)**2 - ap[q] * math.sin(2*theta)
                A[q][q] = ap[p] * math.sin(theta)**2 + aq[q] * math.cos(theta)**2 + ap[q] * math.sin(2*theta)
                A[p][q] = A[q][p] = ap[q] * math.cos(2*theta) + (ap[p] - aq[q])/2 * math.sin(2*theta)

                # stop once every off-diagonal element is below epsilon in absolute value
                finished = all(abs(A[i][j]) < epsilon for i in range(n) for j in range(i+1, n))
                if finished:
                    break
            if finished:
                break
        if finished:
            break
    return [A[i][i] for i in range(n)]
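
For example (a minimal usage sketch; note that jacobi_eigen_fn modifies A in place):

A = [[2.0, 1.0],
     [1.0, 2.0]]
print(jacobi_eigen_fn(A))  # prints values close to [1.0, 3.0]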

Copyright notice
This article was created by [Espresso Macchiato]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/174/202206230506597965.html