Introduction to machine learning (I): understanding maximum likelihood estimation in supervised learning
2022-07-25 07:48:00 【Jasper0420】

1. Abstract
This article demystifies the machine learning modeling process from a statistical perspective. We will show how assumptions about the data let us formulate meaningful optimization problems. In fact, we will derive common criteria such as cross-entropy in classification and mean squared error in regression.
2. Likelihood vs. probability and probability density
First, let's start with a basic question: what is the difference between likelihood and probability? The data $x$ are connected to the possible models $\theta$ through a probability $P(x,\theta)$ or a probability density function (pdf) $p(x,\theta)$.
In short, a pdf describes how likely the different possible values are relative to one another. For a continuous variable, the density at a point is not itself a probability; $p(x,\theta)\,dx$ is the infinitesimal probability of falling in a small interval around $x$. We will stick with the pdf notation. For any given set of parameters $\theta$, $p(x,\theta)$ is intended to be the probability density function of $x$.
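To make the distinction between a density value and a probability concrete, here is a minimal sketch. It assumes a Gaussian model with $\theta = (\mu, \sigma)$, which the text does not specify; the helper names `gaussian_pdf` and `integrate` are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# theta = (mu, sigma) is held fixed here (assumed values); x is what varies.
mu, sigma = 0.0, 0.1

# A density value is not a probability: for a narrow Gaussian it exceeds 1.
peak_density = gaussian_pdf(0.0, mu, sigma)

# Probabilities come from integrating the density over an interval.
total_mass = integrate(lambda x: gaussian_pdf(x, mu, sigma), -1.0, 1.0)
prob_within_one_sigma = integrate(lambda x: gaussian_pdf(x, mu, sigma), -0.1, 0.1)

print(peak_density, total_mass, prob_within_one_sigma)
```

The peak density is well above 1, while the integral over (almost) the whole support is close to 1 and the integral over one standard deviation around the mean is about 0.68.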
The likelihood $p(x,\theta)$ is defined as the joint density of the observed data, viewed as a function of the model parameters. That is, for any given $x$, $p(x=\operatorname{fixed},\theta)$ can be seen as a function of $\theta$. The likelihood is therefore a function of the parameters $\theta$ only, with the data held fixed.
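The idea of holding the data fixed and varying $\theta$ can be sketched as follows. This is an illustration under stated assumptions, not the article's own method: the data values are made up, the model is a Gaussian with known $\sigma = 1$ so that $\theta$ reduces to the mean $\mu$, and the joint density is written as a product of per-sample densities, which presumes the samples are independent.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def likelihood(data, mu, sigma=1.0):
    """Joint density of the fixed data, viewed as a function of mu.

    Assumes independent samples, so the joint density is a product.
    """
    return math.prod(gaussian_pdf(x, mu, sigma) for x in data)

# Hypothetical observations; they stay fixed throughout.
data = [1.8, 2.1, 2.4, 1.9, 2.3]

# Evaluate the same data under several candidate parameter values.
candidates = [0.0, 1.0, 2.0, 2.1, 3.0]
values = {mu: likelihood(data, mu) for mu in candidates}
best_mu = max(values, key=values.get)

for mu, value in values.items():
    print(f"mu={mu:.1f}  likelihood={value:.3e}")
print("best candidate:", best_mu)
```

Among these candidates the likelihood is largest at the one that equals the sample mean of the data, which is the behaviour one would expect for a Gaussian with fixed variance.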
In what follows, we consider a set $X$ of $m$ data instances, $X = \{ \textbf{x}^{(1)}, \ldots, \textbf{x}^{(m)} \}$, that follow the empirical training data distribution $p_{data}^{train}(\textbf{x}) = p_{data}(\textbf{x})$, which is assumed to be a good and representative sample of the unknown, broader data distribution $p_{data}^{real}(\textbf{x})$.
3. The independent and identically distributed assumption
This brings us to the most basic assumption in ML: independent and identically distributed (IID) data (random variables). Statistical independence means that, for random variables $A$ and $B$, the joint distribution $P_{A,B}(A,B)$ factorizes into the product of the marginal distributions: $P_{A,B}(A,B) = P_A(A)\,P_B(B)$.
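Statistical independence of two random variables can also be checked empirically: for independent variables, the joint frequency of each pair of values should be close to the product of the two marginal frequencies. The sketch below uses two simulated dice as a stand-in for $A$ and $B$; the setup and the helper name `max_factorization_gap` are assumptions made for illustration.

```python
import random
from collections import Counter

def max_factorization_gap(pairs):
    """Largest |P(a, b) - P(a) * P(b)| over all value pairs, using empirical frequencies."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal_a = Counter(a for a, _ in pairs)
    marginal_b = Counter(b for _, b in pairs)
    return max(
        abs(joint[(a, b)] / n - (marginal_a[a] / n) * (marginal_b[b] / n))
        for a in marginal_a
        for b in marginal_b
    )

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 200_000

# Two dice rolled separately: A and B are independent.
independent = [(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(n)]

# B is a copy of A: the two variables are fully dependent.
dependent = [(a, a) for a, _ in independent]

gap_independent = max_factorization_gap(independent)
gap_dependent = max_factorization_gap(dependent)
print(gap_independent, gap_dependent)
```

For the independent dice the gap is small and shrinks as the number of samples grows, whereas for the copied die the joint distribution clearly does not factorize.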
To be continued. I have been busy recently and will come back to fill in the rest when I have time.