
Loss of numeric precision in the ETL process

2022-06-26 15:11:00 【RestCloud】

Recently, a user of the RestCloud ETL product was running a data integration job and found that, after the data had been transferred to the target database table, the numeric values had lost precision.

The scenario: the source column in Oracle is defined as number(21,6), and the target column in MySQL is defined as float(21,6). After synchronization, a value that is 538121.47 in Oracle appears as 538121.50 in MySQL. Seeing this, some readers will inevitably suspect a product defect, so let's analyze it.
First, we need to understand how a computer represents numbers. Internally, a computer has two ways to represent decimals: fixed-point and floating-point.
1. Floating-point (float and double): floating-point types store approximate values in the database.
MySQL data type    Meaning
float(m,d)         Single-precision floating point, 4 bytes, about 7 significant decimal digits; m is the total number of digits, d the number of decimal places
double(m,d)        Double-precision floating point, 8 bytes, about 15 to 16 significant decimal digits; m is the total number of digits, d the number of decimal places
Suppose a column is defined as float(5,3) and you insert the number 123.45678. The fractional part is rounded to three decimal places, giving 123.457. Note, however, that float(5,3) leaves only two digits for the integer part, so this value does not fit: MySQL in strict SQL mode rejects it as out of range, and in non-strict mode clips it to 99.999, rather than keeping all six digits.
2. Fixed-point (decimal): fixed-point types store exact values in the database.
Where floating-point types store approximate values, the fixed-point type stores the exact value. In decimal(m,d), m is the total number of digits (m ≤ 65) and d is the number of decimal places (d ≤ 30 and d ≤ m).
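The difference between approximate and exact storage can be reproduced outside the database. The following is a minimal Python sketch, not RestCloud or MySQL code: the helper name to_float32 is made up for illustration, and it assumes that a 4-byte float column behaves like an IEEE 754 single-precision value.

```python
import struct
from decimal import Decimal

def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

source_value = "538121.47"  # the value from the Oracle example in this article

as_float = to_float32(float(source_value))  # what 4 bytes of floating point can hold
as_decimal = Decimal(source_value)          # what a fixed-point type holds

print(f"{as_float:.2f}")  # 538121.50
print(as_decimal)         # 538121.47
```

The single-precision round trip lands on 538121.5, which matches the value observed in MySQL, while the fixed-point value is unchanged.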

For single-precision float: as long as a value with two decimal places lies within ±131072 (65536×2), float stores it accurately enough for the two decimals to read back correctly. Beyond that range, the gap between adjacent float values is too large to preserve the second decimal place, so the results become unreliable, and we found no parameter setting that changes this. Recommendation: change float to double or decimal. The difference between the two is that double is still floating-point (with far more precision), whereas decimal is fixed-point and stores the exact value.
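The ±131072 threshold can be made concrete by computing the gap between neighbouring single-precision values. This is a rough sketch assuming IEEE 754 single precision (24 significant bits); float32_spacing is an illustrative helper, not a library function.

```python
import math

def float32_spacing(value: float) -> float:
    """Gap between adjacent single-precision values near `value` (normal range only)."""
    _, exponent = math.frexp(abs(value))  # abs(value) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)         # a float32 has 24 significant bits

for v in (100000.0, 200000.0, 538121.47):
    gap = float32_spacing(v)
    # The worst-case rounding error is half the gap; two decimal places
    # survive only while that error stays below 0.005.
    print(v, gap, gap / 2 < 0.005)
# 100000.0 0.0078125 True
# 200000.0 0.015625 False
# 538121.47 0.0625 False
```

Below 131072 the gap is 0.0078125, so the rounding error stays under 0.005. From 131072 upward the gap doubles to 0.015625 and two decimals are no longer guaranteed. Near 538121.47 the gap is 0.0625, which is why the value snaps to 538121.50.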

Let's verify this with a test. First, create a test table:

CREATE TABLE customer (
  id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(45) DEFAULT NULL,
  age int(11) DEFAULT NULL,
  jinqian float(5,2) DEFAULT NULL,
  PRIMARY KEY (id)
);

float(m,d)
m is the total number of digits, and d is the number of digits after the decimal point.
In the SQL above, float(5,2) means that the number has at most 5 digits in total, 2 of which are decimal places, leaving 3 digits for the integer part. The storable range depends on whether the column is declared unsigned.
If it is unsigned, the minimum is 0 and the maximum is 999.99; if it is signed, the range is -999.99 to 999.99.
With the single-argument form float(p), p is a precision in bits. A value from 0 to 24 gives a single-precision float with about 7 significant decimal digits (6 in our test); when p is greater than 24, the column is automatically converted to DOUBLE. When both M and D are specified, no automatic conversion takes place.
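As a cross-check on the range, the largest value that fits a given (M,D) can be computed directly. This small sketch assumes MySQL's documented reading of (M,D), where M counts all digits including the D decimal places; max_value is a hypothetical helper written for this illustration.

```python
from decimal import Decimal

def max_value(m: int, d: int) -> Decimal:
    """Largest value with m total digits, d of them after the decimal point."""
    return Decimal(10**m - 1) / Decimal(10**d)

print(max_value(5, 2))  # 999.99
print(max_value(5, 3))  # 99.999
```

Under that reading, a signed float(5,2) column spans -999.99 to 999.99.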

When a value has more decimal places than the declared d, it is rounded when saved:

INSERT INTO customer (id, name, age, jinqian) VALUES (111111111, 'uu', 15, 90.012);
INSERT INTO customer (id, name, age, jinqian) VALUES (1111111111, 'uu', 15, 90.018);

The two rows above are saved with jinqian values of 90.01 and 90.02 respectively.
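The rounding of the two inserted values can be mimicked outside the database. The sketch below uses Python's decimal module with half-up rounding. It illustrates rounding to two decimal places and is not MySQL's internal implementation; round_to_scale is a name chosen for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_scale(text: str, d: int) -> Decimal:
    """Round a decimal literal to d digits after the decimal point."""
    quantum = Decimal(10) ** -d  # d=2 gives Decimal('0.01')
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_to_scale("90.012", 2))  # 90.01
print(round_to_scale("90.018", 2))  # 90.02
```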

Summary
From the analysis above, we can draw the following conclusions:
1. Floating-point numbers carry rounding error.
2. Precision-sensitive data, such as currency, should be represented and stored as fixed-point numbers.
3. When programming with floating-point numbers, pay special attention to the error, and avoid comparing floating-point values for exact equality.
4. Pay attention to the handling of special floating-point values.
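Points 3 and 4 can be illustrated with a few lines of Python. This is only a sketch of the general floating-point behaviour using the standard library, and the tolerance used by math.isclose is its default, which may not suit every application.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
print(total == 0.3)              # False: both sides carry binary rounding error
print(math.isclose(total, 0.3))  # True: compare within a tolerance instead

exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum == Decimal("0.3"))  # True: fixed-point style arithmetic is exact here

nan = float("nan")
print(nan == nan)                # False: NaN never equals anything, even itself
print(math.isnan(nan))           # True: test for NaN explicitly
print(math.isinf(float("inf")))  # True: infinity also needs an explicit check
```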

Original site

Copyright notice
This article was written by [RestCloud]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/177/202206261450456752.html
