SQL tutorial: Six little-known SQL techniques that can save you 100 hours per month

2022-06-22 00:01:00 Knowledge fatness

In my eight-year data career, I have relied on a few simple but little-known SQL techniques that have saved me countless hours when performing analysis and building ETL pipelines.

In this article, I share the ones I use over and over again:

  • Find and delete duplicate records from a table
  • Query the most recent set of records from a table
  • Summarize daily data at the month or week start/end level
  • Aggregate data by custom (CASE WHEN) categories
  • Find the differences between today and yesterday (or any two dates) in the same table
  • Merge data from one table into another table (the simple way)
  • Monitor how many new records are added to a table every day
  • Identify new records added between two dates in a "snapshot" table

Find and delete duplicate records from a table

with x as (select *, row_number() over(partition by [key],[key],[key] order by [key]) as rowRank from {schema}.{table})
select * from x where rowRank > 1;

Nothing is worse than duplicates. Dreaded duplicate records have caused me a great deal of pain throughout my data career. Duplicates can mess up almost any analysis or dashboard, especially the ones that do not simply disappear when you add a DISTINCT clause. There are many ways to identify duplicates, but I have found the example above to be the simplest.
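To see why DISTINCT alone is often not enough, here is a minimal sketch using Python's built-in sqlite3 module so that it can be run as-is. The orders table, its columns, and its rows are hypothetical and exist only for illustration. Two rows share the same key (order_id) but differ in a non-key column, so DISTINCT keeps both of them.

```python
import sqlite3

# Hypothetical in-memory table; order_id is assumed to be the table key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, loaded_at TEXT);
    INSERT INTO orders VALUES
        (1, 'alice', '2022-06-20'),
        (1, 'alice', '2022-06-21'),  -- same key, different load date
        (2, 'bob',   '2022-06-20');
""")

# DISTINCT compares whole rows, so the two rows for order_id 1 both survive.
distinct_rows = conn.execute("SELECT DISTINCT * FROM orders").fetchall()

# Counting distinct keys shows how many rows there should be.
key_count = conn.execute(
    "SELECT COUNT(DISTINCT order_id) FROM orders"
).fetchone()[0]

print(len(distinct_rows))  # 3
print(key_count)           # 2
```

The gap between the two numbers is the duplicate that DISTINCT cannot remove, which is where the row_number approach comes in.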

Just wrap your main query in a CTE and, after all the columns you want to check, add a row_number function that is partitioned by all of the table's keys. The partition must include every table key to work properly; otherwise, you may incorrectly flag non-duplicates as duplicates. What row_number does here is rank every instance of the keys you provide. After the CTE, run a simple SELECT with a WHERE clause that keeps only the rows where the new row_number field is greater than 1. The output returns all duplicate records, because any record with rowRank > 1 has keys that already appear in the table.
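The steps above can be sketched end to end as follows. This is an illustrative example, not the author's exact setup: it runs the same CTE through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer), and the orders table, its columns, and its rows are hypothetical. The delete step is an assumption added here, since the query above only finds duplicates. It relies on SQLite's implicit rowid column to target individual rows; other databases need a different row identifier.

```python
import sqlite3

# Hypothetical in-memory table; order_id is assumed to be the only table key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, loaded_at TEXT);
    INSERT INTO orders VALUES
        (1, 'alice', '2022-06-20'),
        (1, 'alice', '2022-06-21'),
        (2, 'bob',   '2022-06-20'),
        (3, 'carol', '2022-06-20'),
        (3, 'carol', '2022-06-22');
""")

# Step 1: find duplicates. Rows are ranked within each key; rank > 1 is a repeat.
FIND_DUPLICATES = """
    WITH x AS (
        SELECT *,
               row_number() OVER (PARTITION BY order_id ORDER BY loaded_at) AS rowRank
        FROM orders
    )
    SELECT order_id, customer, loaded_at FROM x WHERE rowRank > 1
    ORDER BY order_id;
"""
duplicates = conn.execute(FIND_DUPLICATES).fetchall()
print(duplicates)  # [(1, 'alice', '2022-06-21'), (3, 'carol', '2022-06-22')]

# Step 2 (assumption, SQLite-specific): delete the repeats by rowid,
# keeping the earliest loaded_at for each key.
DELETE_DUPLICATES = """
    WITH x AS (
        SELECT rowid AS rid,
               row_number() OVER (PARTITION BY order_id ORDER BY loaded_at) AS rowRank
        FROM orders
    )
    DELETE FROM orders WHERE rowid IN (SELECT rid FROM x WHERE rowRank > 1);
"""
conn.execute(DELETE_DUPLICATES)
conn.commit()

remaining = conn.execute(
    "SELECT order_id, customer, loaded_at FROM orders ORDER BY order_id"
).fetchall()
print(remaining)
# [(1, 'alice', '2022-06-20'), (2, 'bob', '2022-06-20'), (3, 'carol', '2022-06-20')]
```

Ordering by loaded_at inside the partition decides which copy survives. If no column distinguishes the copies, the choice of survivor is arbitrary, so it is worth running the find step and reviewing its output before deleting anything.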

Copyright notice
This article was written by [Knowledge fatness]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/172/202206211844585132.html