[Regression analysis] Understanding ridge regression through a worked case
2022-06-25 12:06:00 【Halosec_ Wei】
1、 Purpose
Ridge regression is a biased-estimation regression method designed for collinear data. It is essentially a modified least squares estimator: by giving up the unbiasedness of least squares and accepting some loss of information and precision, it obtains regression coefficients that are more realistic and more reliable. It handles ill-conditioned data better than ordinary least squares.
2、 Input / output description
Input: one or more independent variables X, which may be quantitative or categorical; the dependent variable Y must be quantitative (if Y is categorical, use logistic regression instead).
Output: goodness-of-fit test results for the model, the linear relationship between the independent variables and the dependent variable, and so on.
3、 Learning Websites
SPSSPRO: a free, professional online data analysis platform
4、 Case example
Case study: we use the independent variables (room area, floor, unit price, whether there is an elevator, number of schools nearby, and distance from the subway station) to fit and predict the dependent variable (housing price). The unit price and the floor turn out to be strongly collinear, with VIF values above 20, so ordinary least squares (OLS) regression is not appropriate and a ridge regression model is needed.
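The VIF check mentioned in this case can be reproduced by hand. The following is a minimal sketch, assuming NumPy and synthetic data (the article's housing data is not reproduced here); the helper name `vif` is illustrative and is not part of SPSSPRO.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing
    column j on all the other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = resid @ resid
        ss_tot = (target - target.mean()) @ (target - target.mean())
        out[j] = 1.0 / (ss_res / ss_tot)  # same as 1 / (1 - R^2)
    return out

# Synthetic example (assumption): x2 is almost a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))
```

In this synthetic setup the first two columns get very large VIFs while the third stays close to 1, which is the pattern that motivates switching from OLS to ridge regression.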
5、 Case data

Ridge regression case data
6、 Case operation

Step1: Create a new analysis;
Step2: Upload the data;
Step3: Select the data set, open it to preview, and click Start analysis once it looks correct;

Step4: Choose 【Ridge regression (Ridge)】;
Step5: Check the required data format: 【Ridge regression (Ridge)】 needs one or more independent variables X, quantitative or categorical, and a quantitative dependent variable Y.
Step6: Click 【Start analysis】 to complete the operation.
7、 Output result analysis
Output results 1: Ridge trace figure

Chart description: The ridge trace plot is used to determine K. The rule is to pick the smallest K at which the standardized regression coefficients of all independent variables have become stable. Because reading K off the ridge trace is somewhat subjective, SPSSPRO uses the variance inflation factor (VIF) method to determine it automatically, which gives K=0.162 here.
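A ridge trace is simply the standardized coefficients computed over a grid of K values. The sketch below assumes NumPy and synthetic data; `ridge_trace` is a hypothetical helper, not SPSSPRO's implementation. It uses the correlation form of the ridge estimate, (R + kI)^-1 r, where R is the correlation matrix of the predictors and r holds their correlations with Y.

```python
import numpy as np

def ridge_trace(X, y, ks):
    """Standardized ridge coefficients for each ridge parameter k in ks."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized predictors
    yz = (y - y.mean()) / y.std()              # standardized response
    R = Z.T @ Z / n                            # predictor correlation matrix
    r = Z.T @ yz / n                           # predictor-response correlations
    return np.array([np.linalg.solve(R + k * np.eye(p), r) for k in ks])

# Synthetic example (assumption): x1 and x2 are nearly collinear.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
x3 = rng.normal(size=200)
y = 3.0 * x1 + x3 + rng.normal(size=200)

ks = np.linspace(0.0, 1.0, 21)
trace = ridge_trace(np.column_stack([x1, x2, x3]), y, ks)
for k, coefs in zip(ks[::5], trace[::5]):
    print(f"k={k:.2f}", np.round(coefs, 3))
```

Plotting each column of `trace` against `ks` gives the ridge trace; the K to choose is the smallest value after which the curves flatten out.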
Output results 2: Results of ridge regression analysis

*p<0.05,**p<0.01,***p<0.001
Chart description: The ridge regression on area, floor, unit price, number of schools around (1km), distance from subway station (km), and supporting elevator gives a model significance P value of 0.000***, which is significant, so the null hypothesis is rejected: there is a regression relationship between the independent variables and the dependent variable. In addition, the goodness of fit R² is 0.956, which indicates a good fit, so the model basically meets the requirements.
The formula of the model :
Total price = -64.72 + 0.987 × area - 0.043 × floor + 0.008 × unit price - 0.447 × number of schools around (1km) - 4.198 × distance from subway station (km) - 3.674 × supporting elevator
Output results 3: Model path diagram

Chart description: The figure above presents the model results as a path diagram. It mainly shows the model coefficients and can be used to read off the model formula.
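The fitted formula reported under output 2 can be evaluated directly to get a prediction. The sketch below copies the coefficients from that formula; the input values, their units, and the coding of the elevator variable (1 = has elevator, 0 = none) are assumptions, since the article does not state them.

```python
def predict_total_price(area, floor, unit_price, schools_1km, subway_km, elevator):
    """Evaluate the fitted ridge regression equation from the article.

    `elevator` is assumed to be coded 1 (yes) or 0 (no).
    """
    return (-64.72
            + 0.987 * area
            - 0.043 * floor
            + 0.008 * unit_price
            - 0.447 * schools_1km
            - 4.198 * subway_km
            - 3.674 * elevator)

# Hypothetical listing, chosen only to illustrate the arithmetic.
price = predict_total_price(area=100, floor=10, unit_price=10000,
                            schools_1km=2, subway_km=1.0, elevator=1)
print(round(price, 3))
```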
Output results 4: Model result diagram

Chart description: The figure above plots the original data together with the model's fitted values.
8、 Notes
- In general, before running ridge regression, first run ordinary linear regression (least squares). If an independent variable's VIF (a measure of collinearity) is too large, above 10, switch to ridge regression;
- SPSSPRO uses the variance inflation factor method to find K automatically;
- The general principles for choosing K are:
  - the ridge estimates of all regression coefficients are basically stable;
  - coefficients whose least squares estimates had implausible signs take plausible signs under ridge estimation;
  - no regression coefficient has an absolute value that contradicts its economic meaning;
  - the residual sum of squares does not increase much.
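One common way to automate the choice of K with variance inflation factors is to take the smallest K for which every ridge VIF falls below a threshold such as 10. The sketch below illustrates that idea and is not SPSSPRO's documented procedure. It assumes NumPy and synthetic data; the names `ridge_vif` and `smallest_k_by_vif` and the grid of candidate K values are hypothetical. The ridge VIFs are the diagonal of (R + kI)^-1 R (R + kI)^-1, which reduces to the ordinary VIFs at k = 0.

```python
import numpy as np

def ridge_vif(R, k):
    """VIFs of the ridge estimates for correlation matrix R and parameter k."""
    A = np.linalg.inv(R + k * np.eye(R.shape[0]))
    return np.diag(A @ R @ A)

def smallest_k_by_vif(R, threshold=10.0, ks=None):
    """Smallest k on the grid whose largest ridge VIF is below threshold."""
    if ks is None:
        ks = np.linspace(0.0, 1.0, 1001)  # assumed search grid, step 0.001
    for k in ks:
        if ridge_vif(R, k).max() < threshold:
            return float(k)
    return None  # no k on the grid satisfies the threshold

# Synthetic example (assumption): x1 and x2 are nearly collinear.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
x3 = rng.normal(size=200)
R = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)

k_star = smallest_k_by_vif(R)
print("largest VIF at k=0:", round(float(ridge_vif(R, 0.0).max()), 1))
print("selected k:", k_star)
```

A value found this way should still be checked against the principles listed above, in particular that the residual sum of squares has not grown much.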
9、 Model theory
Ridge regression is a regression method from statistics. In machine learning it is also known as weight decay, and some call it Tikhonov regularization. It mainly addresses two problems: first, the case where the number of predictors exceeds the number of observations (predictors correspond to features, observations to labeled samples); second, multicollinearity in the data set, that is, correlation among the predictors.
In general, a regression model can be written in matrix form as:

$$y = X\beta + \varepsilon$$

Ordinary least squares solves this regression problem by minimizing:

$$\min_{\beta} \; \lVert y - X\beta \rVert_2^2$$

Ridge regression adds a penalty term to this objective:

$$\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2$$

Here λ is a parameter that must also be chosen; it is the ridge parameter called K in the sections above. In other words, ridge regression is least squares regression with an L2-norm penalty.
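Minimizing the squared error plus λ times the squared L2 norm of the coefficients has the closed-form solution β = (XᵀX + λI)⁻¹Xᵀy. The sketch below is one way to compute it, assuming NumPy and synthetic data. Centering the data so that the intercept is left unpenalized is a common convention and an assumption here, not something the article specifies; `ridge_fit` is a hypothetical helper.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate; returns (intercept, coefficients).

    The data are centered first, so the intercept is not penalized.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'yc instead of forming the inverse.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

# Synthetic example (assumption): x1 and x2 are nearly collinear.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
y = 2.0 + 3.0 * x1 + x3 + rng.normal(size=200)

b0_ols, beta_ols = ridge_fit(X, y, lam=0.0)       # lam = 0 reduces to OLS
b0_ridge, beta_ridge = ridge_fit(X, y, lam=10.0)  # lam > 0 shrinks the coefficients
print("OLS:  ", np.round(beta_ols, 3))
print("ridge:", np.round(beta_ridge, 3))
```

Because the penalty acts on the raw coefficients, the result depends on the scale of each column; standardizing the predictors first, as the ridge trace does, makes a single λ comparable across variables.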
10、 Reference
[1] Liu Chao. Regression Analysis: Methods, Data and R Applications. Higher Education Press, 2019.