
How can a complete beginner get started with data analysis?

2022-06-25 12:05:00 Halosec_ Wei

I have worked in data science for almost five years. I have designed and developed two data analysis platforms and have deep hands-on experience with business data analysis. I can't stand articles of the "Learn Data Analysis in 5 Days" kind, or the ones that just throw out a pile of skill lists and book lists.

As we know, the essence of data analysis is to support decision-making: we should use it to find solutions to business problems. Today I will draw on how data science methods are applied in practice to explain the real skills of data analysis, the shortcuts, and its business essence. This is all practical material, so readers who genuinely want to master business data analysis should be patient. In this article I share, for free, a framework for business data analysis. Two colleagues I trained went from novice to expert by relying on this framework, doing business data analysis for user growth, precision marketing, and similar work.

If you would rather not read the text, you can watch the video instead:

How can a complete beginner quickly learn data analysis (12 min)

 

---------------------------------- I'm a split line ---------------------------------

Straight to the point: a framework for business data analysis

 

Step1: Map out and understand the business logic diagram of the object being analyzed;

Step2: Collect data and build a portrait of the object being analyzed;

Step3: Locate the anchor point of the business problem, then analyze the portrait differences between the objects at two stages and the similar behaviors or characteristics of the objects within each stage;

Step4: Form hypotheses from those portrait differences or similar behaviors, in combination with the business;

Step5: Test the hypotheses by every available means (A/B test, user visits), using small scenarios and small-scale trial and error.

 

 

Business data analysis framework

OK, now I will explain each step in combination with my understanding of data science.

 

Step1: Map out and understand the business logic diagram of the object being analyzed;

First, let's look at some real problems:

• The company ran a campaign and the results were poor: users showed up but did not place orders. As a data analyst, how do you analyze this?

• This month the company's online product sales fell off a cliff. As a data analyst, how do you analyze this?

• User feedback on the new product is good and a lot of advertising has been bought, but sales will not go up. Why?

• The exposure of a financial product has not dropped, but buyers keep churning. Why?

Look closely and these questions are essentially retention and conversion analysis. As noted earlier, the essence of data analysis is to support decision-making, and the essence of business data analysis is to find patterns in the data and turn them into effective strategies for business problems. Retention and conversion analysis is what the business uses for precision marketing and user growth; this is the kind of analysis that brings in revenue for the company and makes you more competitive in the workplace.

Back to the topic. To do data analysis, you must first fully understand the logic of the business. Mapping out and understanding the business logic diagram of the analysis object means drawing the business path, or business flow diagram, for the real business in front of you. The industry has many theories and path frameworks for this step. As I summarize them, they fall into two main categories:

• Path-class models

Examples are path analysis and funnel analysis. Their defining feature is that adjacent nodes overlap: the users at one node contain the users at the next node.

 

In funnel analysis, for example, the user count at one node includes the users who go on to reach the next node.

• Hierarchical-class models

Examples are the life cycle model and the RFM model. Their defining feature is that adjacent nodes do not overlap.

 

In the life cycle model, for example, the users in the formation stage do not include the users in the next stage; the relationship between adjacent stages is purely one of levels.
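The difference between the two categories can be sketched with plain Python sets. This is only an illustration: the user IDs and the stage names (`browse`, `formation`, and so on) are invented, and the two helper functions are hypothetical names, not part of any platform.

```python
# Hypothetical user IDs, invented purely for illustration.

# Path-class model (funnel): each node's users contain the next node's users.
funnel = {
    "browse": {1, 2, 3, 4, 5, 6},
    "add_to_cart": {2, 3, 5},
    "buy": {3, 5},
}

# Hierarchical-class model (life cycle): each user sits in exactly one stage.
lifecycle = {
    "formation": {1, 2},
    "growth": {3, 4},
    "mature": {5},
    "decline": {6},
}


def is_nested(stages):
    """True if every stage's users are a subset of the previous stage's users."""
    groups = list(stages.values())
    return all(later <= earlier for earlier, later in zip(groups, groups[1:]))


def is_disjoint(stages):
    """True if no user appears in more than one stage."""
    groups = list(stages.values())
    total = sum(len(g) for g in groups)
    return total == len(set().union(*groups))


print(is_nested(funnel), is_disjoint(funnel))        # True False
print(is_nested(lifecycle), is_disjoint(lifecycle))  # False True
```

A quick check like this on real stage data is a cheap way to confirm which kind of model a business path actually is before analyzing it.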

In practice, the most common ways of mapping out a business are the funnel model and the life cycle model, especially the latter. You could say that the essence of user operations is mapping out the user life cycle. How well the business logic is mapped out directly affects the analysis in Step 3. Here, taking the purchase of financial product A as an example, we use funnel analysis to map out the business logic:

 

Register and open a personal account → Deposit money online, or bind a card and transfer funds → Receive a pushed wealth-management message → Browse the recommended financial products → Buy the financial product
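Once the path is written down, the step-by-step conversion rates follow directly from the user counts at each stage. A minimal sketch, assuming made-up counts (none of the numbers below come from the article) and a hypothetical helper `step_conversion`:

```python
# Hypothetical user counts per funnel stage; none of these numbers come from the article.
funnel = [
    ("register and open account", 100_000),
    ("deposit or transfer funds", 60_000),
    ("receive pushed message", 50_000),
    ("browse recommended products", 20_000),
    ("buy financial product", 1_500),
]


def step_conversion(stages):
    """Return (from_stage, to_stage, rate) for each pair of adjacent stages."""
    return [
        (a_name, b_name, b_count / a_count)
        for (a_name, a_count), (b_name, b_count) in zip(stages, stages[1:])
    ]


rates = step_conversion(funnel)
for src, dst, rate in rates:
    print(f"{src} -> {dst}: {rate:.1%}")

# The weakest step is a natural candidate for the business problem anchor in Step 3.
weakest = min(rates, key=lambda r: r[2])
print("largest drop-off:", weakest[0], "->", weakest[1])
```

With these invented counts the largest drop-off is between browsing and buying, which is the same kind of gap the financial product example focuses on.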

Step2: Collect data and build a portrait of the object being analyzed;

The core of the second step is collecting data, and the way to do it is to build a label system for the business object. In our example the business is the purchase of financial products, and the business object is the bank card users of a given platform.

In recent years the terms "user portrait" and "label-based operations" have become especially popular, and they are constantly emphasized in precision marketing and user growth. Put plainly, it means building a label system for the business object and then using screening logic to select ("circle") a customer group; the interval distribution of that group's labels is the user portrait. Popular user portrait platforms in the industry include Sensors Data, GrowingIO, Getui, and so on, but in essence their core role is to manage and store labels. Some of the regional banks I have worked with simply use Excel for user portraits instead. So the platform itself is worth little; the real core competence is the ability to build a label system for users (the business objects). That is why the contracts signed around such platforms are mostly for staffing services, with the platform half sold and half given away. For example, what our company's precision marketing project team actually does on the customer's site is help the customer build a user label system. Building an indicator system is therefore an essential skill for data analysts and the core competitiveness of the role. As the saying in the industry goes: a data analyst who cannot build an indicator system is still a novice.
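The idea that a portrait is "the interval distribution of a circled group's labels" can be sketched in a few lines. Everything here is an assumption for illustration: the label names (`age`, `monthly_income`, `bought_product`), the interval edges in `BINS`, and the five sample users are invented.

```python
from collections import Counter

# Hypothetical label table: one dict per user. Label names and values are invented.
users = [
    {"id": 1, "age": 23, "monthly_income": 6_000, "bought_product": False},
    {"id": 2, "age": 35, "monthly_income": 18_000, "bought_product": True},
    {"id": 3, "age": 41, "monthly_income": 25_000, "bought_product": True},
    {"id": 4, "age": 29, "monthly_income": 9_000, "bought_product": False},
    {"id": 5, "age": 52, "monthly_income": 31_000, "bought_product": True},
]

# Interval edges used to bucket each numeric label (assumed; tune per business).
BINS = {
    "age": [0, 30, 45, 200],
    "monthly_income": [0, 10_000, 20_000, 10**9],
}


def bucket(value, edges):
    """Return the half-open interval [lo, hi) that contains value."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= value < hi:
            return f"[{lo}, {hi})"
    raise ValueError(f"{value} is outside the configured bins")


def portrait(group):
    """Share of the group falling into each interval, per label."""
    result = {}
    for label, edges in BINS.items():
        counts = Counter(bucket(u[label], edges) for u in group)
        result[label] = {k: n / len(group) for k, n in counts.items()}
    return result


# "Circle" a customer group with a screening rule, then read off its portrait.
buyers = [u for u in users if u["bought_product"]]
print(portrait(buyers))
```

This is also why a spreadsheet can stand in for a portrait platform: the platform stores and filters labels, while the hard part is deciding which labels and intervals to build.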

I usually build labels along four broad technical paths: basic labels, statistical labels, feature labels, and model labels. Below is the indicator system I built for customer portraits in the banking industry. Because the full picture is too large, only an abbreviated version is shown here.

 

Whether the label system is sound is the key to fine-grained operations. If the label system is incomplete, there is no point talking about data analysis. The good news is that many to-C companies already have user portrait platforms, so ordinary business staff can solve many business problems simply by applying this analysis framework.

 

Step3: Locate the anchor point of the business problem and analyze the data;

The core of the analysis here is to analyze the portrait differences between the objects at two stages, and the similar behaviors or characteristics of the objects within each stage.

Start with the portrait differences between the two stages. Looking back at the funnel diagram, we find that conversion falls off a cliff between the last two stages: the conversion rate for buying financial product A is 3%, while the conversion rate for browsing the recommended financial products is 47%, so only about 6.3% of users are retained between the two. This is a precision marketing problem: how do we improve the conversion from browsing recommended financial products to purchasing?

 

This question again comes back to user portrait analysis. Call the customer group that browses the recommended financial products A, and the customers who buy the financial product B, so that (A-B) is the group that browsed but did not buy. I summarize the analysis into three points:

• The differences between the user portraits of customer group (A-B) and customer group (B)

• The similar labels (behaviors) within the user portraits of customer group (A-B)

• The similar labels (behaviors) within the user portraits of customer group (B)

PS: The typical data analyst's line of thinking covers questions such as:

From stage A to stage B, what characteristics do most users share?

From stage A to stage B, do most users show some of the same behaviors?

From stage A to stage B, is conversion affected by different channel sources?

From stage A to stage B, is conversion affected by different educational backgrounds?

……

So in essence these are the three points I listed above. I have built up some experience here as well. Take the first point, the differences between the user portraits of customer group (A-B) and customer group (B). We need to analyze four aspects:

• Which labels differ?

(Is it gender, income, or something else that differs between the two customer groups?)

• How large is the overall difference on that label?

(Is the gender difference large? Or the income difference?)

• Which specific intervals of the label's user distribution cause the difference?

(Which income ranges make the difference?)

• How large is the difference in that interval?

(Which income range produces the largest difference?)
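One simple way to walk through these four aspects is to compare the two groups' label distributions directly. The sketch below is not the author's modeling method, which the article does not disclose; it swaps in total variation distance as a stand-in measure of "overall difference". The portraits and all numbers are invented.

```python
# Hypothetical portraits: share of each group in each interval of each label.
# "A_minus_B" = browsed but did not buy; "B" = bought. All numbers are invented.
portraits = {
    "gender": {
        "A_minus_B": {"male": 0.50, "female": 0.50},
        "B": {"male": 0.55, "female": 0.45},
    },
    "monthly_income": {
        "A_minus_B": {"<10k": 0.60, "10k-20k": 0.30, ">=20k": 0.10},
        "B": {"<10k": 0.20, "10k-20k": 0.35, ">=20k": 0.45},
    },
}


def interval_gaps(p, q):
    """Signed difference q - p for every interval that appears in either group."""
    return {k: q.get(k, 0.0) - p.get(k, 0.0) for k in sorted(set(p) | set(q))}


def total_variation(p, q):
    """Overall difference between two distributions, between 0 and 1."""
    return 0.5 * sum(abs(g) for g in interval_gaps(p, q).values())


# Aspects 1 and 2: which labels differ, and by how much overall?
overall = {
    label: total_variation(g["A_minus_B"], g["B"]) for label, g in portraits.items()
}
ranked = sorted(overall.items(), key=lambda kv: kv[1], reverse=True)
print("labels ranked by difference:", ranked)

# Aspects 3 and 4: within the top label, which intervals drive the gap, and how large is it?
top_label = ranked[0][0]
gaps = interval_gaps(portraits[top_label]["A_minus_B"], portraits[top_label]["B"])
worst_interval = max(gaps, key=lambda k: abs(gaps[k]))
print(top_label, gaps, "largest gap at", worst_interval)
```

A distance like this only ranks the differences; whether a gap is statistically meaningful is a separate question for hypothesis testing.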

We can use data models to analyze these four aspects. The specific methods are my stock-in-trade and there is a lot of material, so I will not include them here. If this article ever passes 10,000 likes, I will write a separate chapter on how to use data modeling to address the four aspects above, covering statistics and machine learning, or I may release demo code.

As for the similar behaviors or characteristics within each stage's objects, this belongs to interpretable machine learning, and a classification model will generally do. For example, to study the similar behaviors or characteristics within the portrait of customer group A, we can build a classification model whose inputs are labels and whose output is positive or negative: the positive samples are the user IDs in customer group A, and the negative samples are the user IDs not in customer group A. This is also known as lookalike audience expansion, and the hit probability the model produces can itself be used as a derived label. It is hard to extract the shared behaviors or features from a black-box model, but interpretable methods such as decision trees can surface them objectively. This is also called constructing the screening conditions for a customer group portrait. I will write a separate chapter on how to implement this, with case studies (quietly asking for a like).
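To make the decision-tree idea concrete without any library, here is a minimal sketch of a one-level tree (a decision stump) that searches for the single `label == value` test that best separates group A from everyone else, scored by Gini impurity. It is a simplified stand-in, not the author's implementation; the labels (`gender`, `has_salary_card`, `age_band`) and the eight sample users are invented.

```python
# Hypothetical labelled users: each row is (labels, in_group_A). Data is invented.
rows = [
    ({"gender": "male", "has_salary_card": True, "age_band": "30-45"}, True),
    ({"gender": "male", "has_salary_card": True, "age_band": "18-30"}, True),
    ({"gender": "female", "has_salary_card": True, "age_band": "30-45"}, True),
    ({"gender": "male", "has_salary_card": False, "age_band": "30-45"}, False),
    ({"gender": "female", "has_salary_card": False, "age_band": "18-30"}, False),
    ({"gender": "female", "has_salary_card": False, "age_band": "45+"}, False),
    ({"gender": "male", "has_salary_card": True, "age_band": "45+"}, True),
    ({"gender": "female", "has_salary_card": False, "age_band": "30-45"}, False),
]


def gini(targets):
    """Gini impurity of a list of booleans (0 means perfectly pure)."""
    if not targets:
        return 0.0
    p = sum(targets) / len(targets)
    return 2 * p * (1 - p)


def best_split(data):
    """Find the single 'label == value' test that best separates group A."""
    best = None
    for label in data[0][0]:
        for value in {labels[label] for labels, _ in data}:
            inside = [y for labels, y in data if labels[label] == value]
            outside = [y for labels, y in data if labels[label] != value]
            # Weighted impurity of the two branches after the split.
            score = (len(inside) * gini(inside) + len(outside) * gini(outside)) / len(data)
            hit_rate = sum(inside) / len(inside)
            # Prefer the lowest impurity; break ties toward rules that select group A.
            key = (score, -hit_rate)
            if best is None or key < best[0]:
                best = (key, label, value)
    (score, neg_hit_rate), label, value = best
    return label, value, score, -neg_hit_rate


label, value, score, hit_rate = best_split(rows)
print(f"screening rule: {label} == {value!r} "
      f"(impurity {score:.3f}, share in group A {hit_rate:.0%})")
```

A real tree would split recursively and handle numeric labels, but even a stump shows how a model's split can be read back as a screening condition for a customer group.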

 

Step4: Form hypotheses from the portrait differences or similar behaviors above, in combination with the business;

Hypotheses are designed mainly from the portrait differences; as noted in the previous step, the core use of users' similar behaviors is expanding to lookalike users (enlarging the audience for a campaign). Following the business path, we can pick the customer groups at any two stages, explore their differences, and address the conversion problem between them. Back to the original example: suppose the bank's app pushes an advertisement for a financial product to 1,000,000 people and only 20,000 click it. This is a hierarchical model (described in Step 1), because the audience splits into two non-overlapping customer groups: customer group A (the 980,000 people who did not click the advertisement) and customer group B (the 20,000 people who clicked it). Suppose the Step 3 analysis finds that the labels that differ are gender, income, age, ..., with quantified differences of 0.98, 0.74, 0.71, ... respectively. We then see that customer group A is 50% male and 50% female, while customer group B is 97% male and 3% female. We can hypothesize that men are more receptive to this advertisement and more interested in this financial product, and the operating strategy could be: push this financial product to male bank users.
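Before acting on the hypothesis, it is worth checking that the gender gap in this example is far larger than chance would produce. The sketch below turns the stated proportions into counts and runs a Pearson chi-square test on the resulting 2x2 table (no continuity correction). The choice of test is my assumption; the article does not say how its difference values of 0.98, 0.74, 0.71 were computed.

```python
import math

# Counts implied by the worked example: 980,000 non-clickers split 50/50,
# 20,000 clickers split 97/3 between men and women.
table = {
    "male": {"clicked": 19_400, "not_clicked": 490_000},
    "female": {"clicked": 600, "not_clicked": 490_000},
}


def click_rate(row):
    """Share of a gender's users who clicked the advertisement."""
    return row["clicked"] / (row["clicked"] + row["not_clicked"])


def chi_square_2x2(t):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 degree of freedom)."""
    a, b = t["male"]["clicked"], t["male"]["not_clicked"]
    c, d = t["female"]["clicked"], t["female"]["not_clicked"]
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


stat, p_value = chi_square_2x2(table)
print(f"male click rate:   {click_rate(table['male']):.2%}")
print(f"female click rate: {click_rate(table['female']):.2%}")
print(f"chi-square = {stat:.0f}, p-value = {p_value:.3g}")
```

A significant difference only says the two groups differ on gender; whether pushing to men actually lifts conversion is what Step 5 has to test.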

 

Step5: Test the hypotheses by every available means (A/B test, user visits), using small scenarios and small-scale trial and error.

What we do at this stage is test the hypotheses, and there are usually two ways. One is visits: go out to front-line users, understand their needs, and check the hypotheses proposed above. The other is the A/B test, which verifies hypotheses by splitting traffic. Let me focus on the latter. An A/B test is a methodology for comparing which of two things is better. For example, a platform divides its users into groups, tests different solutions on them at the same time, and uses the real data fed back by users to find out which solution is better. It solves the problem of having several options and picking "the better one" by gut feeling.
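As a rough illustration of both halves of an A/B test, the sketch below splits traffic by hashing the user ID and then compares conversion rates with a two-sided two-proportion z-test. The experiment name, the 50/50 split, and the trial counts are all invented, and hashing is just one common way to assign groups, not something the article prescribes.

```python
import hashlib
import math


def assign_group(user_id, experiment="push_to_male_users"):
    """Deterministically split traffic 50/50 by hashing the user ID.

    The experiment name is hypothetical; it keeps different tests independent.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 else "control"


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results of a small-scale trial; these counts are invented.
control = {"users": 5_000, "conversions": 100}    # current push strategy
treatment = {"users": 5_000, "conversions": 150}  # new targeted strategy

z, p_value = two_proportion_z_test(
    control["conversions"], control["users"],
    treatment["conversions"], treatment["users"],
)
print(assign_group(42))
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

Hash-based assignment keeps a user in the same group across sessions, which matters when the effect is measured over several days.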

The predecessor of the A/B test is the randomized controlled trial, specifically the double-blind trial: in medical and biological experiments, subjects are randomly assigned to groups, each group receives a different intervention, and the effects are compared. In a double-blind trial, patients are randomly divided into two groups and given either a placebo or the drug under test without knowing which; after a period of time the two groups are compared for a significant difference, to determine whether the drug is really effective. In 2000, Google engineers ran their first A/B test to determine the best number of results to display on a search results page. A/B testing has kept evolving since, but the foundations and basic principles have largely stayed the same. In 2011, 11 years after its first test, Google ran more than 7,000 different A/B tests.

40 minutes to get started with user operations (life cycle model, with a strategy data analysis framework)

Overview of hypothesis testing models (differences among 18 models: t-test, z-test, chi-square test, analysis of variance, etc.)

Customer life cycle model and label construction - article by Stream - Zhihu

Improved bank customer value analysis (RFM) - article by Stream - Zhihu

In business data analysis for user growth, precision marketing, and the like, the user portrait matters most. Today's framework is actually the most basic one, and its core is again the label system (user portrait); this is where a data analyst's ability shows. Without a solid, complete user portrait label system on the platform, you cannot make bricks without straw: there is no point talking about data-driven decision-making, user growth, or precision marketing. In fact, the only data analysis model this framework involves is difference analysis (hypothesis testing).

To sum up: a complete beginner who wants to learn data analysis with this framework only needs to spend a few hours learning how to use the 18 statistical models for hypothesis testing.


Addendum: in the industry, user growth and precision marketing usually take one of two forms:

1. Without data. This leans toward product operations and growth hacking. Its core is to start from user needs: whatever users need, you design a feature for it. For example, on opposite-sex dating apps such as Soul and Momo, users just want to chat and flirt with the opposite sex, so building a few opposite-sex chatbots to meet that need could increase the number of users. (That is a joke; consider whether a feature is feasible in a production environment.)

2. With data. This leans toward data operations. Its core is to build user portraits and mine operating strategies by analyzing the differences between customer groups. This is the standard practice of business data analysts, and it is the process this article explains.

 

Finally, a tool for learning data analysis models: I have shared templates for hundreds of data analysis models, distilled from hundreds of data analysis and consulting projects (you can download them yourself from the MPai data analysis platform to study).

Official website of the MPai data science platform: www.mpaidata.com

Original site

Copyright notice
This article was written by [Halosec_ Wei]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202200535108700.html