[AI practice] Data normalization and standardization in machine learning data processing
2022-06-23 07:15:00 【szZack】
Data normalization and standardization for machine learning data processing
This article introduces three normalization and standardization methods.
1. Min-max standardization (min-max normalization)
Apply a linear transformation to the original data so that the result falls in the interval [0,1]. The conversion function is:
$$x^* = \frac{x - \min}{\max - \min}$$
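- Note
max and min must be fixed: compute them once (for example, on the training set) and reuse the same values for every data set you transform afterwards.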
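As a minimal sketch of the formula and the note above, the following Python computes min and max once and then reuses them. The function names `fit_min_max` and `min_max_scale` and the sample values are illustrative assumptions, not part of the original article.

```python
def fit_min_max(values):
    """Return the (min, max) pair computed once from reference data."""
    return min(values), max(values)


def min_max_scale(values, lo, hi):
    """Map each value to (x - lo) / (hi - lo) using the fixed lo and hi."""
    if hi == lo:
        raise ValueError("max and min are equal; the transform is undefined")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical sample data, used only for illustration.
train = [2.0, 4.0, 6.0, 10.0]
lo, hi = fit_min_max(train)                      # fixed once: lo = 2.0, hi = 10.0
scaled_train = min_max_scale(train, lo, hi)      # [0.0, 0.25, 0.5, 1.0]
scaled_new = min_max_scale([8.0, 12.0], lo, hi)  # [0.75, 1.25]
```

Because min and max stay fixed, a later value outside the original range (12.0 here) maps outside [0,1]. Whether to clip such values depends on the application.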
2. Z-score (standard deviation) standardization
This method standardizes the data using the mean and the standard deviation of the original data.
The processed data have a mean of 0 and a standard deviation of 1. They follow the standard normal distribution only if the original data were normally distributed. The transformation function is:
$$x^* = \frac{x - \mu}{\sigma}$$
where μ is the mean of all sample data and σ is the standard deviation of all sample data.
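The z-score transformation can be sketched in a few lines of Python. This sketch assumes the population standard deviation (dividing by n), which the article does not specify. The function names and sample values are hypothetical.

```python
import math


def fit_mean_std(values):
    """Return the mean and the population standard deviation of values."""
    n = len(values)
    mu = sum(values) / n
    # Assumption: divide by n (population form), not n - 1 (sample form).
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return mu, sigma


def z_score(values, mu, sigma):
    """Map each value to (x - mu) / sigma."""
    if sigma == 0:
        raise ValueError("standard deviation is 0; the transform is undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical sample data, used only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = fit_mean_std(data)           # mu = 5.0, sigma = 2.0
standardized = z_score(data, mu, sigma)  # mean 0, standard deviation 1
```

Recomputing the mean and standard deviation of `standardized` gives 0 and 1. The shape of the distribution is unchanged, since the transform only shifts and rescales the data.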
3. Nonlinear normalization
Nonlinear normalization is often used when the data vary widely in scale, with some values very large and others very small. The original values are mapped through a mathematical function.
Such functions include the logarithm and the arctangent, among others. The curve of the nonlinear function is chosen according to the distribution of the data:
Logarithmic function transformation method
For example, with $y = \ln(x)$, the corresponding normalization is:
$$x^* = \frac{\ln(x)}{\ln(\max)}$$
where $\max$ is the maximum value of the sample data, $x^*$ is the normalized value, and $x$ is the input value. All sample data must be greater than or equal to 1, and the maximum must be greater than 1 so that $\ln(\max)$ is nonzero.
Arctangent function transformation method
The data can also be normalized with the arctangent function:
$$x^* = \arctan(x) \cdot \frac{2}{\pi}$$
When using this method, note that if the target interval is [0,1], the data must be greater than or equal to 0. Data less than 0 are mapped to the interval [-1,0].
L2 norm normalization method
L2 norm normalization divides each element of the feature vector by the vector's L2 norm:
$$x_i^* = \frac{x_i}{\mathrm{norm}(x)}$$
where the L2 norm of the vector $x = (x_1, x_2, \dots, x_n)$ is defined as:
$$\mathrm{norm}(x) = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$
Property: the squares of the elements of the converted data $x^*$ sum to 1.
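The three transformations in this section (logarithmic, arctangent, and L2 norm) can be sketched together in plain Python. The function names, the input checks, and the sample values below are illustrative assumptions rather than the author's implementation.

```python
import math


def log_normalize(values):
    """x* = ln(x) / ln(max); requires every x >= 1 and max > 1."""
    if min(values) < 1:
        raise ValueError("all values must be greater than or equal to 1")
    largest = max(values)
    if largest == 1:
        raise ValueError("max must be greater than 1 so that ln(max) is nonzero")
    return [math.log(x) / math.log(largest) for x in values]


def atan_normalize(values):
    """x* = arctan(x) * 2 / pi; negative inputs give negative outputs."""
    return [math.atan(x) * 2 / math.pi for x in values]


def l2_normalize(vector):
    """Divide each element by the vector's L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("the zero vector has no L2-normalized form")
    return [x / norm for x in vector]


# Hypothetical sample data, used only for illustration.
log_scaled = log_normalize([1.0, 10.0, 100.0])  # approximately 0, 0.5, 1
atan_scaled = atan_normalize([-1.0, 0.0, 1.0])  # approximately -0.5, 0, 0.5
unit = l2_normalize([3.0, 4.0])                 # approximately 0.6, 0.8
```

In this example the squares of the elements of `unit` sum to 1, matching the property stated above, and the negative input to `atan_normalize` lands in the negative half of the output range.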