What is data mining?
2022-06-26 06:43:00 【The code family】
1、 The concept of data mining
Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random real-world data.
The data source used for data mining must be real and large in scale, and it may be incomplete and contain noisy items. The information and knowledge discovered must be interesting and useful to the user. In general, the results of data mining are not required to be perfectly accurate knowledge; the goal is to find broad trends.
Put simply, data mining is the process of operating on large amounts of data to discover useful knowledge. It is an interdisciplinary field that draws on machine learning, mathematical statistics, neural networks, databases, pattern recognition, rough sets, fuzzy mathematics, and other related techniques.
2、 Application of data mining
In concrete applications, data mining is the process of using various analysis tools to discover models and relationships in massive data sets; these models and relationships can then be used to make predictions.
Knowledge discovery in data mining does not aim to discover universal truths, new theorems of natural science, or pure mathematical formulas, nor is it machine theorem proving. In practice, all discovered knowledge is relative: it holds under specific premises and constraints and is oriented to a particular domain. It should also be easy for users to understand, ideally expressed in natural language.
Data mining is essentially a deeper form of data analysis. Data analysis itself has a long history, but in the past data was collected and analyzed mainly for scientific research. In addition, the limited computing power of the time severely restricted complex analysis of large amounts of data.
3、 Value types of data mining
Data mining finds valuable information in massive data to support business decision-making. This value usually takes the form of correlations, trends, and features.
1) Correlation
Correlation analysis examines two or more related variables in order to measure how strongly they are related.
Correlation analysis is only meaningful when some relationship or probabilistic dependence exists between the elements. Correlation is not causation, and it applies in almost every field we encounter. Correlation analysis is used to determine how data changes together, that is, whether a change in one or more attributes is accompanied by a change in other attributes, and how large that change is. Figure 1 shows examples of several common correlations.
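One common way to measure the degree of correlation between two numeric variables is the Pearson correlation coefficient. The article does not name a specific measure, so this is an illustrative choice, and the data (`ad_spend`, `units_sold`) is made up for the example. A minimal sketch using only the Python standard library:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: weekly advertising spend and units sold.
ad_spend = [10, 20, 30, 40, 50, 60]
units_sold = [25, 41, 62, 79, 103, 118]

r = pearson(ad_spend, units_sold)
print(f"correlation: {r:.3f}")
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one. As noted above, a high value alone does not show that one variable causes the other.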

2) Trends
Trend analysis compares actual results with historical data for the same indicators across the financial statements of different periods, in order to determine the direction and pattern of change in financial position, operating results, and cash flow. A line chart can show the direction of the data and support forecasting; the results can also be explained through period-over-period (for example, month-over-month) and year-over-year comparisons, as shown in Figure 2.
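The period-over-period and year-over-year comparisons mentioned above are both relative changes against a different baseline. The sketch below assumes monthly figures keyed by `(year, month)`; the `revenue` values and the function names are hypothetical and chosen only for illustration.

```python
def pct_change(current, previous):
    """Relative change from previous to current, as a fraction (0.1 means +10%)."""
    if previous == 0:
        raise ValueError("previous value is zero; relative change is undefined")
    return (current - previous) / previous

def month_over_month(series, year, month):
    """Compare a month with the month immediately before it."""
    prev = (year, month - 1) if month > 1 else (year - 1, 12)
    return pct_change(series[(year, month)], series[prev])

def year_over_year(series, year, month):
    """Compare a month with the same month one year earlier."""
    return pct_change(series[(year, month)], series[(year - 1, month)])

# Hypothetical monthly revenue, keyed by (year, month).
revenue = {
    (2021, 5): 100.0,
    (2021, 6): 110.0,
    (2022, 5): 120.0,
    (2022, 6): 132.0,
}

mom = month_over_month(revenue, 2022, 6)  # baseline: May 2022
yoy = year_over_year(revenue, 2022, 6)    # baseline: June 2021
print(f"month-over-month: {mom:.1%}, year-over-year: {yoy:.1%}")
```

Year-over-year comparison removes seasonal effects by using the same month as the baseline, while month-over-month comparison shows short-term movement.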

3) Features
Feature analysis identifies the characteristics of the main objects of study according to the specific analysis task. For example, Internet data mining identifies users' characteristics across many dimensions to build user profiles, and then labels user groups accordingly, as shown in Figure 3.
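User labeling of this kind can be sketched as a set of rules that map user features to tags. The feature names, thresholds, and tag names below are all hypothetical; a real system would derive them from the data and the business question rather than hard-code them.

```python
def tag_user(profile):
    """Return a sorted list of labels for one user, based on hypothetical rules."""
    tags = set()
    if profile["orders_last_30d"] >= 5:
        tags.add("frequent_buyer")
    if profile["avg_order_value"] >= 200:
        tags.add("high_spender")
    if profile["night_sessions_ratio"] >= 0.5:
        tags.add("night_owl")
    if not tags:
        # Fallback label for users who match none of the rules.
        tags.add("casual")
    return sorted(tags)

# Hypothetical user features extracted from behavior logs.
users = {
    "u1": {"orders_last_30d": 8, "avg_order_value": 250, "night_sessions_ratio": 0.1},
    "u2": {"orders_last_30d": 1, "avg_order_value": 40, "night_sessions_ratio": 0.7},
    "u3": {"orders_last_30d": 0, "avg_order_value": 0, "night_sessions_ratio": 0.0},
}

segments = {user_id: tag_user(profile) for user_id, profile in users.items()}
for user_id, tags in segments.items():
    print(user_id, tags)
```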
