The difference between data indicators and the label system in a data warehouse
2022-07-24 00:20:00 【000X000】
Let's start with a simple example:
Say we need to introduce Mr. Chen. There are three ways to put it:
Indicators: Mr. Chen is 180 cm tall and weighs 200 jin (100 kg).
Label: Mr. Chen is 1.8 m tall and a fat man.
Label: Mr. Chen? Ever heard of Li Kui, the Black Whirlwind?
This is the intuitive difference between labels and indicators. A data indicator is a precise description of something in numbers. Height, weight, waist size, and arm length are all data indicators. A label is a summary description with business meaning, derived by processing the raw data. The single phrase "fat man" summarizes height and weight at once, while "looks like Li Kui" goes further and summarizes facial features, build, temperament, and other characteristics.
Indicators vs. labels
Clearly, describing things with data indicators is more accurate. But labels are just as important, because people need more than accuracy.
First, not every feature can be described by a data indicator. Common indicators are usually continuous variables (for example, height 183 cm) or ordinal variables (risk level A, B, C, D, E). Many other features exist as categorical variables. Product features such as specification (50 ml per bottle), color (red, orange, yellow, green), and purpose (home health care, outdoor protection, and so on) are generally described with labels. This is also where the word "label" originally comes from.
Second, labels carry business meaning. Given only two indicators, height 183 and weight 200 jin, people do not feel much on hearing them. But once a label is attached (height 183 and weight 200: very burly, or height 183 and weight 200: fat man), a picture forms in the mind immediately.
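To make the step from indicators to a label concrete, here is a minimal sketch in Python. It is only an illustration: the function name `build_label`, the use of BMI, and every threshold are assumptions chosen for the example and do not come from the article. The only fixed fact is the unit conversion (1 jin = 0.5 kg).

```python
JIN_TO_KG = 0.5  # 1 jin is 500 g


def build_label(height_cm: float, weight_jin: float) -> str:
    """Collapse two indicators (height, weight) into one coarse label."""
    weight_kg = weight_jin * JIN_TO_KG
    bmi = weight_kg / (height_cm / 100) ** 2

    # All cut points below are illustrative assumptions, not business rules.
    if bmi >= 28:
        build = "heavy"
    elif bmi >= 18.5:
        build = "medium"
    else:
        build = "slim"

    if height_cm >= 180:
        stature = "tall"
    elif height_cm < 160:
        stature = "short"
    else:
        stature = "average height"

    return f"{stature}, {build} build"


print(build_label(183, 200))
```

The label throws away precision on purpose: two numbers become one phrase that a business user can act on without interpreting the raw values.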
Finally, labels are easier for the business to use. When setting someone up on a date, saying "let me introduce you to a cute, petite girl" moves things forward far more easily than "let me introduce you to a girl who is 153 cm tall and weighs 85 jin". That is the charm of labels.
So building a label system is very important. It not only enriches the material available for data analysis, it also directly helps analysis results get put into practice.
What kinds of labels are there
There are four types of labels:
1. Basic feature labels
2. Rule-based calculation labels
3. Composite calculation labels
4. Model prediction labels
The four categories are introduced as follows

Quite a few companies have never sorted out their labels systematically and are left with a large number of scattered basic feature labels. Some business departments habitually propose their own rule-based or composite labels without reaching consensus with other departments, so those labels are hard to reuse. All of this limits what labels can do.
So if labels are done well, what can they do?
Typical label usage scenarios
One: looking up information. This is the most common scenario. Large numbers of front-line staff, such as customer service, sales, after-sales, and copywriters, need to use labels to quickly find the right products, customers, and campaign information and work more efficiently. Labels used for lookup do not need to be complex. Basic feature labels are enough.
Two: material for analysis. For example, in a funnel analysis you see that channel A converts better than channel B, but how do you explain it? This is where a series of labels can be brought in, for example:
Channel label: public domain, public-private domain, vertical private domain
Copy label: product knowledge, offers, personal sharing
Product label: traffic driver, hot seller, profit maker
Discount label: large, medium, small
With these labels, there are many more leads to follow when explaining why a conversion rate is high. By classifying and comparing, tracking, and testing, you can see which label combination has the highest conversion rate. This is far more useful than looking only at the conversion rate and the UV of each page.
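The comparison by label combination can be sketched as a simple group-and-count. The snippet below is a hedged illustration using only the Python standard library. The visit records are made-up sample data, and the function name `conversion_by_labels` is hypothetical. A real analysis would read from the warehouse and include more label dimensions.

```python
from collections import defaultdict

# Hypothetical visit records: (channel label, copy label, converted?)
visits = [
    ("public domain", "offers", True),
    ("public domain", "offers", False),
    ("public domain", "product knowledge", False),
    ("public domain", "product knowledge", False),
    ("vertical private domain", "offers", True),
    ("vertical private domain", "offers", True),
    ("vertical private domain", "personal sharing", True),
    ("vertical private domain", "personal sharing", False),
]


def conversion_by_labels(records):
    """Return the conversion rate for each (channel, copy) label combination."""
    counts = defaultdict(lambda: [0, 0])  # combination -> [conversions, visits]
    for channel, copy, converted in records:
        counts[(channel, copy)][1] += 1
        if converted:
            counts[(channel, copy)][0] += 1
    return {combo: conv / total for combo, (conv, total) in counts.items()}


rates = conversion_by_labels(visits)
# List combinations from best to worst conversion rate.
for combo, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(combo, f"{rate:.0%}")
```

With real data the sample sizes per combination matter, so a combination with very few visits should be treated as a lead to test rather than a conclusion.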

Another point: a lot of toB analysis is very superficial precisely because there are too few labels. Nothing is known about the customer, the negotiation, or the delivery process. All that is known is that the customer has not signed yet, or that the customer has not paid three months after the contract. Of course analysis like this cannot go anywhere.
Three: strategy making. When formulating strategies, there are often fixed target customers, target products, and target channels. On the customer side, for example, activating dormant users, retaining churned users, and managing risky users are common fixed themes. In these cases, using a fixed label such as risk level A, B, C, D, E is much easier than working out the rules from scratch every time. The label's accuracy can also be improved continuously with the help of algorithmic models. This is the advanced application of labels.
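As a rough sketch of what a fixed label buys you, the snippet below defines a risk level once and reuses it when selecting users for a strategy. Everything specific here is an assumption for illustration: the 0 to 100 risk score, the cut points, the convention that A is the lowest risk, and the function names.

```python
from bisect import bisect_right

# Hypothetical cut points on a 0-100 risk score; A is assumed to be lowest risk.
RISK_CUTS = [20, 40, 60, 80]
RISK_LEVELS = "ABCDE"


def risk_level(score: float) -> str:
    """Map a numeric risk score to a fixed label A-E, defined in one place."""
    return RISK_LEVELS[bisect_right(RISK_CUTS, score)]


def users_for_risk_management(scores: dict) -> list:
    """Select users for a risk-management strategy by label, not by ad hoc rule."""
    return sorted(uid for uid, s in scores.items() if risk_level(s) in ("D", "E"))


scores = {"u1": 12, "u2": 65, "u3": 88, "u4": 40}
print({uid: risk_level(s) for uid, s in scores.items()})
print(users_for_risk_management(scores))
```

Because every strategy calls the same `risk_level`, tightening the cut points later (or replacing them with a model's output) changes the label everywhere at once.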
Advanced applications need complex labels of the composite calculation and model prediction types. In how they are built, the label system differs significantly from the data indicator system. Building a data indicator system is about being comprehensive: within a business scenario, collect as many data indicators as possible, and the more the better. Building a label system is about being orderly and effective: focus on one business goal and combine as many fragmentary, raw descriptive labels as possible into labels that are useful to the business. With labels, quality matters more than quantity.
How to improve label quality
Compared with data indicators, label quality is an inherent problem. Labels are produced by people and carry subjective judgment. It is quite possible that a label's description is inaccurate, or that the data source behind the label does not express the label's meaning well, which leads to misjudgment. We often say "don't label people" precisely because a mistaken first impression can distort the judgment of the whole person.
So labels are used very differently from data indicators. Once data indicators have been sorted out, they will not change much unless the process changes. Labels, on the other hand, need to be optimized continuously around the same goal while they are being built. There is a clear act of "purification".
Purification depends on a clear goal. Say I want to create a label: high-potential users. If the goal is "I want to know which users have high potential", that says nothing. The right way to state it is: "Once I know which users have high potential, I can offer them a more expensive product bundle, their response rate will be higher, and my investment cost will be lower." A goal that clearly states the scenario in which the label is used and the difference expected in the data is a good goal.
Once you have a goal, you can start building from zero. In the early stage there are often only scattered basic features. At this point you can use the basic features directly, or do exploratory analysis to see what characterizes the users who meet the goal, or simply list a few rules off the top of your head. In short, get an initial label rule in place that can be adjusted, then iterate step by step. It is enough to see that the label's power to distinguish users keeps getting stronger.
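One simple way to check whether a label's distinguishing effect is improving is to compare the response rate of labeled users with that of everyone else, for each version of the rule. The sketch below is an illustration under assumptions: the user records are invented, the two rule versions are hypothetical, and the "lift" ratio is just one possible measure, not one the article prescribes.

```python
# Hypothetical users: spend, order count, and whether they responded to the offer.
users = [
    {"spend": 900, "orders": 6, "responded": True},
    {"spend": 800, "orders": 1, "responded": False},
    {"spend": 300, "orders": 5, "responded": True},
    {"spend": 200, "orders": 1, "responded": False},
    {"spend": 700, "orders": 4, "responded": True},
    {"spend": 100, "orders": 2, "responded": True},
    {"spend": 150, "orders": 1, "responded": False},
    {"spend": 650, "orders": 1, "responded": False},
]


def response_rate(group):
    return sum(u["responded"] for u in group) / len(group)


def response_lift(population, rule):
    """Response rate of labeled users divided by that of unlabeled users."""
    labeled = [u for u in population if rule(u)]
    others = [u for u in population if not rule(u)]
    if not labeled or not others:
        return None  # the rule does not split the population at all
    base = response_rate(others)
    return float("inf") if base == 0 else response_rate(labeled) / base


def rule_v1(u):  # first guess: high spenders are "high potential"
    return u["spend"] >= 600


def rule_v2(u):  # revised after exploration: frequent buyers instead
    return u["orders"] >= 4


lift_v1 = response_lift(users, rule_v1)
lift_v2 = response_lift(users, rule_v2)
print(f"v1 lift: {lift_v1:.1f}, v2 lift: {lift_v2:.1f}")
```

A lift near 1 means the label does not separate responders from non-responders. A rising lift across rule versions is the kind of signal that the iteration is working.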