Autumn Recruitment Guide C
2022-06-23 01:22:00 【Shallow look】
Exercise 1: Rows to columns
Suppose we have the match results inserted by the statements below.
Create the match result table row_to_col:
create table row_to_col
(cdate DATE,
result varchar(32) not null);
insert into row_to_col values(20210101,'win');
insert into row_to_col values(20210101,'loss');
insert into row_to_col values(20210103,'win');
insert into row_to_col values(20210103,'loss');
insert into row_to_col values(20210101,'win');
insert into row_to_col values(20210103,'loss');
Use SQL to convert the match results into the following form: one row per date, with the number of wins and the number of losses as columns.
Approach:
- Group the query results by cdate
- Use count together with if to count the 'win' and 'loss' games for each day
The SQL statement is as follows:
select cdate as match_date,
count(if(result='win',true,null)) as win,
count(if(result='loss',true,null)) as loss
from row_to_col
group by cdate;
The result is as follows: 2 wins and 1 loss on 2021-01-01, and 1 win and 2 losses on 2021-01-03.
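The pivot above can be tried without a MySQL server. The following is a minimal sketch using Python's built-in sqlite3 module, under a few assumptions that are not in the original: dates are stored as ISO text, the two result values are 'win' and 'loss', and MySQL's if(...) is replaced by the portable CASE WHEN form, since SQLite has no if function in older versions.

```python
import sqlite3

# In-memory database; dates are stored as ISO text (an assumption for SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE row_to_col (cdate TEXT, result TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO row_to_col VALUES (?, ?)",
    [
        ("2021-01-01", "win"),
        ("2021-01-01", "loss"),
        ("2021-01-03", "win"),
        ("2021-01-03", "loss"),
        ("2021-01-01", "win"),
        ("2021-01-03", "loss"),
    ],
)

# COUNT ignores NULL, so CASE without ELSE counts only the matching rows.
pivot_sql = """
SELECT cdate AS match_date,
       COUNT(CASE WHEN result = 'win'  THEN 1 END) AS win,
       COUNT(CASE WHEN result = 'loss' THEN 1 END) AS loss
FROM row_to_col
GROUP BY cdate
ORDER BY cdate
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
print(pivot_rows)  # [('2021-01-01', 2, 1), ('2021-01-03', 1, 2)]
```

The key point is that count skips NULL values, which is why the non-matching branch must produce NULL rather than 0 or false.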
Exercise 2: Columns to rows
Suppose we have the match results inserted by the statements below:
Create the match result table col_to_row:
create table col_to_row
(match_date date,
win integer(4) not null,
loss integer(4) not null,
primary key(match_date));
insert into col_to_row values(20210101,2,1);
insert into col_to_row values(20210103,1,2);
Approach:
I was short on time. I thought about this one for a long while without coming up with a good approach, so I am leaving a placeholder here and will come back to it when I get the chance.
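The author leaves this exercise open. One common way to turn columns into rows, offered here only as a hedged sketch and not as the author's solution, is to select each column separately and stack the results with UNION ALL. The column names match_date, win and loss and the output column games are assumptions made for this illustration, and the sketch produces one row per date and result with its count rather than one row per individual game.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE col_to_row ("
    "match_date TEXT PRIMARY KEY, win INTEGER NOT NULL, loss INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO col_to_row VALUES (?, ?, ?)",
    [("2021-01-01", 2, 1), ("2021-01-03", 1, 2)],
)

# Each SELECT turns one column into (date, label, value) rows;
# UNION ALL stacks them without removing duplicates.
unpivot_sql = """
SELECT match_date, 'win' AS result, win AS games FROM col_to_row
UNION ALL
SELECT match_date, 'loss' AS result, loss AS games FROM col_to_row
ORDER BY match_date, result DESC
"""
unpivot_rows = conn.execute(unpivot_sql).fetchall()
for row in unpivot_rows:
    print(row)
```

Expanding each count back into individual game rows would additionally need a numbers table or a recursive query, which is not shown here.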
Exercise 3: Consecutive logins
Problem:
There is a user behavior record table t_act_records with two fields: uid (user ID) and imp_date (date).
- For each month of 2021, calculate the maximum number of consecutive login days for each user
- For each month of 2021, list the users who logged in on 2 consecutive days
- For each month of 2021, count the users who logged in on 5 consecutive days
Create table t_act_records:
DROP TABLE if EXISTS t_act_records;
CREATE TABLE t_act_records
(uid VARCHAR(20),
imp_date DATE);
INSERT INTO t_act_records VALUES('u1001', 20210101);
INSERT INTO t_act_records VALUES('u1002', 20210101);
INSERT INTO t_act_records VALUES('u1003', 20210101);
INSERT INTO t_act_records VALUES('u1003', 20210102);
INSERT INTO t_act_records VALUES('u1004', 20210101);
INSERT INTO t_act_records VALUES('u1004', 20210102);
INSERT INTO t_act_records VALUES('u1004', 20210103);
INSERT INTO t_act_records VALUES('u1004', 20210104);
INSERT INTO t_act_records VALUES('u1004', 20210105);
Approach:
- Pick any date earlier than every date in the table as a reference date, and use the datediff function to compute the number of days between each login date and the reference date.
- For each user, sort the login dates and number them, then subtract this sequence number from the day count of step 1. Call the difference ranking.
- When a user's login dates are consecutive, ranking is the same for all of them, because both the day count and the sequence number increase by 1 from one day to the next.
- Group by month, user (uid), and ranking to get the length of every consecutive-login streak in each month.
- Sort by streak length in descending order with order by to find the maximum number of consecutive login days.
This approach is based on the article: mysql Continuous date statistics _MYSQL – Calculate the number of consecutive days
SQL statement:
select month(imp_date) as login_month,
uid,
min(imp_date) as start_date,
max(imp_date) as end_date,
count(*) as consecutive_days
from (select uid,imp_date,
datediff(imp_date,'2020-01-01')-rank()over(partition by uid order by imp_date) as ranking
from t_act_records) as r
group by uid,month(imp_date),r.ranking
order by consecutive_days desc
Running this returns every consecutive-login streak for each user in each month.
For problems 2 and 3, just add one of the following where conditions.
Note that the query above has to be wrapped as a derived table (aliased p here) and filtered from the outside. SQL evaluates from, then where, then select, so the alias consecutive_days does not exist yet when where runs inside the same query. The comparison is >= so that a user with a longer streak also qualifies.
where p.consecutive_days >= 2
--
where p.consecutive_days >= 5
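To see why the difference between the day count and the sequence number identifies a streak, the same steps can be traced in plain Python. This is a minimal sketch under stated assumptions: the function and variable names are hypothetical, duplicate logins on the same day are dropped with a set, and the sequence number comes from enumerate, which plays the role of rank() in the SQL.

```python
from collections import Counter, defaultdict
from datetime import date

records = [
    ("u1001", date(2021, 1, 1)),
    ("u1002", date(2021, 1, 1)),
    ("u1003", date(2021, 1, 1)),
    ("u1003", date(2021, 1, 2)),
    ("u1004", date(2021, 1, 1)),
    ("u1004", date(2021, 1, 2)),
    ("u1004", date(2021, 1, 3)),
    ("u1004", date(2021, 1, 4)),
    ("u1004", date(2021, 1, 5)),
]

def streak_lengths(rows, reference=date(2020, 1, 1)):
    """Return {(month, uid, group_key): streak length in days}."""
    days_by_user = defaultdict(set)
    for uid, day in rows:
        days_by_user[uid].add(day)  # the set removes same-day duplicates
    lengths = defaultdict(int)
    for uid, days in days_by_user.items():
        for seq, day in enumerate(sorted(days), start=1):
            # Consecutive days share the same (day count - sequence number).
            group_key = (day - reference).days - seq
            lengths[(day.month, uid, group_key)] += 1
    return lengths

# Problem 1: longest streak per (month, user).
best = defaultdict(int)
for (month, uid, _), length in streak_lengths(records).items():
    best[(month, uid)] = max(best[(month, uid)], length)

# Problem 2: users with a streak of at least 2 days, per month.
users_with_2 = sorted(key for key, n in best.items() if n >= 2)
# Problem 3: number of users with a streak of at least 5 days, per month.
count_with_5 = Counter(month for (month, _), n in best.items() if n >= 5)

print(dict(best))
print(users_with_2)
print(dict(count_with_5))
```

With the sample data, u1004 has a 5-day streak and u1003 a 2-day streak, so both appear in the list for problem 2 and only u1004 is counted for problem 3.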
Exercise 4: Causes of data skew in Hive and optimization strategies
Causes:
1) Keys are unevenly distributed
2) Characteristics of the business data itself
3) Insufficient consideration when designing the tables
4) Some SQL statements inherently cause data skew
For details, see the article: Hive Causes and solutions of data skew
Exercise 5: Can a LEFT JOIN produce more rows? Why?
Yes. When a row in the left table matches more than one row in the right table on the join key, it appears once per match, so the result can have more rows than the left table.
Run the SQL statement:
SELECT *
FROM A
LEFT JOIN B
ON A.name = B.name
The result is as follows:
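The row increase is easy to reproduce. The sketch below uses Python's built-in sqlite3 module with made-up sample data: the contents of A and B and the city column are assumptions for illustration, since the original tables are not shown. A has 2 rows, and one of its names appears twice in B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (name TEXT)")
conn.execute("CREATE TABLE B (name TEXT, city TEXT)")
conn.executemany("INSERT INTO A VALUES (?)", [("Tom",), ("Ann",)])
# 'Tom' appears twice in B; 'Ann' does not appear at all.
conn.executemany(
    "INSERT INTO B VALUES (?, ?)",
    [("Tom", "Beijing"), ("Tom", "Shanghai")],
)

left_rows = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
joined = conn.execute(
    "SELECT A.name, B.city FROM A LEFT JOIN B ON A.name = B.name "
    "ORDER BY A.name, B.city"
).fetchall()

print(left_rows)  # 2
print(joined)     # [('Ann', None), ('Tom', 'Beijing'), ('Tom', 'Shanghai')]
```

The unmatched row for Ann is kept once with NULL in the right-hand column, while Tom is repeated for each of his two matches, giving 3 rows from a 2-row left table.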
References:
mysql Continuous date statistics _MYSQL – Calculate the number of consecutive days
Hive Causes and solutions of data skew
Full exercise set:
DataWhale Team learning