About the two architectures of ETL (ETL architecture and ELT architecture)
2022-08-04 17:32:00 【Microservice mall technology sharing】
ETL, short for Extract-Transform-Load, describes the process of extracting data from a source, transforming it, and loading it into a destination. The term is most common in data warehousing, but it is not limited to data warehouses.

ETL is an important part of building a data warehouse. The required data is extracted from the data source, cleaned, and finally loaded into the data warehouse according to a predefined data warehouse model.
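
The extract, clean, and load sequence described above can be sketched as follows. This is a minimal illustration, not a production pipeline: the CSV content, the `fact_orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read a file or a source database.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean the rows outside the database."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().lower(),       # normalize the name format
            float(amount) if amount else 0.0,      # replace a missing amount
        ))
    return cleaned


def load(conn, rows):
    """Load: write cleaned rows into the predefined warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM fact_orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Note that the cleaning happens in the Python process, before any row reaches the target table.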

The transformation step of ETL typically covers the following:
- Null value handling: null field values can be detected and then loaded as-is, replaced with a meaningful value, or used to route the record to a different target database.
- Data format normalization: format constraints can be defined per field, so the load format of time, numeric, and character data from the source can be customized.
- Data splitting: fields can be decomposed according to business requirements. For example, the calling number 861082585313-8148 can be split into an area code and a telephone number.
- Data validation: Lookup and split functions can be used to verify data. For example, after the calling number 861082585313-8148 is split into an area code and a telephone number, Lookup can return the calling area recorded by the calling gateway or switch, which is then used to verify the data.
- Data replacement: invalid or missing data can be replaced according to business rules.
- Lookup of missing data: Lookup runs sub-queries and returns missing fields obtained by other means, to ensure field integrity.
- Primary and foreign key constraints in the ETL process: invalid records that lack their dependencies can be replaced or exported to an error file, so that only records with unique primary keys are loaded.
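
Several of the transformations listed above (null handling, splitting, lookup-based verification, and setting aside invalid records) can be sketched in a few lines. The number layout is an assumption made for illustration: country code 86, a two-digit area code, an eight-digit number, and an optional extension after the hyphen. The `AREA_BY_CODE` table, the record fields, and the function names are hypothetical, and the `rejected` list stands in for an error file.

```python
import re

# Hypothetical lookup table standing in for the area recorded by the gateway or switch.
AREA_BY_CODE = {"10": "Beijing", "21": "Shanghai"}

# Assumed layout: country code 86, 2-digit area code, 8-digit number, optional "-extension".
CALLING_NUMBER = re.compile(r"^86(\d{2})(\d{8})(?:-(\d+))?$")


def split_calling_number(raw):
    """Split a calling number into its parts, or return None if it does not match."""
    match = CALLING_NUMBER.match(raw)
    if match is None:
        return None
    area_code, number, extension = match.groups()
    return {"area_code": area_code, "number": number, "extension": extension}


def transform(records):
    """Return (clean_rows, rejected_rows) after null handling, splitting, and verification."""
    clean, rejected = [], []
    for record in records:
        caller = record.get("caller")
        parts = split_calling_number(caller) if caller else None
        if parts is None:
            rejected.append(record)  # null or malformed calling number
            continue
        area = AREA_BY_CODE.get(parts["area_code"])
        if area is None or area != record.get("gateway_area"):
            rejected.append(record)  # lookup disagrees with the gateway record
            continue
        duration = record.get("duration")
        clean.append({**parts, "area": area,
                      "duration": 0 if duration is None else duration})  # replace null
    return clean, rejected


records = [
    {"caller": "861082585313-8148", "gateway_area": "Beijing", "duration": None},
    {"caller": None, "gateway_area": "Beijing", "duration": 12},
    {"caller": "862112345678", "gateway_area": "Beijing", "duration": 30},
]
clean, rejected = transform(records)
print(clean)
print(len(rejected))
```

Only the first record survives: the second has a null calling number, and the third resolves to an area that does not match the one recorded by the gateway.
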
Advantages of the ETL architecture:
- ETL runs on a separate hardware server, so it takes load off the database system.
- Compared with the ELT architecture, ETL can implement more complex data transformation logic.
- ETL is independent of the underlying database storage.
ELT
In the ELT architecture, the ELT tool is only responsible for providing a graphical interface for designing business rules. The data itself flows only between the source and target databases, and the tool coordinates the relevant database systems, which execute the processing. Processing can run on either the source database side or the target data warehouse side, depending mainly on the system's architecture design and the characteristics of the data. When the process needs to run faster, this can be achieved by tuning the relevant database or by changing the server that performs the processing. Database vendors generally promote this architecture strongly; Oracle and Teradata are examples.
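
As a rough illustration of the ELT flow, the sketch below loads raw rows unchanged into a staging table and then lets the database engine do the transformation in SQL. It assumes an in-memory SQLite database as the target; the `staging_orders` and `fact_orders` tables and the sample rows are hypothetical, and a real deployment would use the warehouse's own engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the target data warehouse

# Load: raw data lands in a staging table without any cleaning.
conn.execute("CREATE TABLE staging_orders (order_id TEXT, customer TEXT, amount TEXT)")
raw_rows = [("1", " alice ", "10.50"), ("2", "BOB", None), ("3", "carol", "7.25")]
conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", raw_rows)

# Transform: the database engine itself cleans the data with SQL.
conn.execute(
    """
    CREATE TABLE fact_orders AS
    SELECT CAST(order_id AS INTEGER)           AS order_id,
           LOWER(TRIM(customer))               AS customer,
           COALESCE(CAST(amount AS REAL), 0.0) AS amount
    FROM staging_orders
    """
)
conn.commit()

result = conn.execute(
    "SELECT order_id, customer, amount FROM fact_orders ORDER BY order_id"
).fetchall()
print(result)
```

Here the Python process only issues statements; no row leaves the database during the transformation, which is the property the ELT architecture relies on.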

Advantages of the ELT architecture:
- ELT scales mainly through the database engine. In particular, when data processing runs at night, the resources of the database engine can be fully used.
- ELT keeps all data in the database at all times, which avoids exporting and reloading data. This improves efficiency and makes the system easier to monitor.
- ELT can optimize parallel processing based on how the data is distributed, and can optimize disk I/O by using the built-in capabilities of the database.
- The scalability of ELT depends on the scalability of the database engine and its hardware server.
- Performance tuning of the relevant database can generally make the process 3 to 4 times more efficient without particular difficulty.