[AI Practice] Data Normalization and Standardization in Machine Learning Data Processing

2022-06-23 07:15:00 szZack

Data normalization and standardization for machine learning data processing

This article introduces three methods for normalizing and standardizing data.

1. Min-max standardization (min-max normalization)

Min-max normalization applies a linear transformation to the original data so that the results fall in the interval [0, 1]. The conversion function is:
$$x^* = \frac{x - \min}{\max - \min}$$
where $\min$ and $\max$ are the minimum and maximum of the sample data.

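  • Note
    max and min must be fixed: the same max and min have to be used for every sample, otherwise the scaled values are not comparable with each other.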
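
As a minimal sketch of the formula and the note above, the following Python splits the work into two steps: computing min and max once, then reusing those fixed values for any later data. The function names `min_max_fit` and `min_max_scale` and the sample values are assumptions made for illustration; they do not come from the original article.

```python
def min_max_fit(values):
    """Return the (min, max) pair that will be reused for all later scaling."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # max - min would be zero, so the formula is undefined.
        raise ValueError("min and max are equal; min-max scaling is undefined")
    return lo, hi


def min_max_scale(values, lo, hi):
    """Apply x* = (x - min) / (max - min) with fixed min and max."""
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical sample data, used only for illustration.
train = [2.0, 4.0, 6.0, 10.0]
lo, hi = min_max_fit(train)

scaled_train = min_max_scale(train, lo, hi)  # [0.0, 0.25, 0.5, 1.0]
scaled_new = min_max_scale([8.0], lo, hi)    # [0.75], same min and max reused
```

Note that a new value outside the original [min, max] range would map outside [0, 1]; how to handle that case (clipping, refitting) is a design choice the article does not cover.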

2. Z-score (standard deviation) standardization

Z-score standardization uses the mean and the standard deviation of the original data to standardize it.

The processed data have a mean of 0 and a standard deviation of 1. They follow the standard normal distribution only if the original data are themselves normally distributed, because the transformation shifts and rescales the data without changing the shape of its distribution. The transformation function is:
$$x^* = \frac{x - \mu}{\sigma}$$
where $\mu$ is the mean of all sample data and $\sigma$ is the standard deviation of all sample data.
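
The transformation can be sketched with Python's standard `statistics` module. This example assumes the population standard deviation (`statistics.pstdev`), since the text refers to the standard deviation of all sample data; the function name `z_score` and the sample values are hypothetical.

```python
import statistics


def z_score(values):
    """Apply x* = (x - mu) / sigma using the mean and population std of values."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical sample data: mean 5, population standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_score(data)

# After the transformation, the mean is 0 and the standard deviation is 1
# (up to floating-point rounding).
new_mean = statistics.fmean(standardized)
new_std = statistics.pstdev(standardized)
```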

3. Nonlinear normalization

Nonlinear normalization is often used when the data are widely spread, with some values very large and others very small. The original values are mapped through a mathematical function.

Methods include logarithmic and arctangent transformations, among others. The curve of the nonlinear function is chosen according to the distribution of the data:

  • Logarithmic function transformation
    For example, with $y = \ln(x)$ the corresponding normalization is:
    $$x^* = \frac{\ln(x)}{\ln(\max)}$$
    where $\max$ is the maximum value of the sample data, $x^*$ is the normalized value, and $x$ is the input value. All sample data must be greater than or equal to 1, and $\max$ must be greater than 1 so that the denominator is not zero.

  • Arctangent function transformation
    The data can be normalized with the arctangent function:
    $$x^* = \arctan(x) \cdot \frac{2}{\pi}$$
    When using this method, note that the result falls in [0, 1) only if the data are greater than or equal to 0. Data less than 0 are mapped to the interval (-1, 0).

  • L2 norm normalization
    L2 norm normalization divides each element of the feature vector by the L2 norm of the vector:
    $$x_i^* = \frac{x_i}{\mathrm{norm}(x)}$$
    where the L2 norm of the vector $x = (x_1, x_2, \ldots, x_n)$ is defined as:
    $$\mathrm{norm}(x) = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$
    Property: the squares of the elements of the converted vector $x^*$ sum to 1.
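
The three transformations in the list above can be sketched in plain Python with the standard `math` module. The function names (`log_normalize`, `arctan_normalize`, `l2_normalize`) and the input values are assumptions chosen for illustration, not part of the original article.

```python
import math


def log_normalize(values):
    """Apply x* = ln(x) / ln(max); requires every value >= 1 and max > 1."""
    if min(values) < 1:
        raise ValueError("all values must be greater than or equal to 1")
    max_value = max(values)
    if max_value == 1:
        raise ValueError("ln(max) is zero; the transformation is undefined")
    denom = math.log(max_value)
    return [math.log(x) / denom for x in values]


def arctan_normalize(values):
    """Apply x* = arctan(x) * (2 / pi); the sign of each value is preserved."""
    return [math.atan(x) * 2 / math.pi for x in values]


def l2_normalize(vector):
    """Divide each element by the L2 norm sqrt(x_1^2 + ... + x_n^2)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("the zero vector has no L2 normalization")
    return [x / norm for x in vector]


# Hypothetical inputs, used only for illustration.
log_scaled = log_normalize([1.0, 10.0, 100.0])      # roughly 0, 0.5, 1
arctan_scaled = arctan_normalize([-1.0, 0.0, 1.0])  # roughly -0.5, 0, 0.5
unit_vector = l2_normalize([3.0, 4.0])              # norm is 5
sum_of_squares = sum(x * x for x in unit_vector)    # close to 1
```

The log and arctangent functions operate on each sample value independently, while `l2_normalize` operates on one feature vector at a time, which matches how the formulas above are stated.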

Copyright notice
This article was written by [szZack]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/174/202206230623566559.html