Study on correlation of pumpkin price and design of price prediction model based on BP neural network

2022-06-25 09:12:00 Zitian

With the all-round victory of the national campaign against poverty, the focus of work on agriculture, rural areas and farmers has shifted to comprehensively promoting rural revitalization. 2021 was the first year for consolidating the achievements of poverty alleviation and advancing rural revitalization, and the relevant departments of the autonomous region are actively planning and carrying out work under the rural revitalization strategy. However, while the strategy is being implemented and the digital agricultural economy promoted, problems remain in the rural agricultural production environment. For example, crop diseases and insect pests are still classified and identified through offline diagnosis, which is slow, and data-driven planting models have not been widely adopted, so output is disconnected from market demand. This competition track focuses on such problems in farmers' production, and the best results from the competition are intended to provide reference countermeasures for rural revitalization.

Part 1: Data preprocessing

1. Loading the dataset and results:

After studying the competition task and discussing how the various factors relate to pumpkin price, we decided to merge the 2016-2017 pumpkin records for every county and city in the annual datasets into a single dataset. From it we kept the fields most relevant to price (packaging, variety, place of origin, temperature, precipitation and average price) as the training set for the price correlation study and for building and training the model. To make the data easy for a neural network to process, the categorical features are encoded with one-hot encoding. After one-hot encoding, each categorical feature becomes a group of binary features that are mutually exclusive: only one is active at a time. The data therefore becomes sparse. This solves the problem that the model cannot handle categorical attribute data directly and, to some extent, also expands the feature set.
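The encoding step described above can be sketched with pandas. This is only an illustration: the column names and the three sample rows are made up, since the article does not show the actual field names of the competition dataset.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are illustrative only.
raw = pd.DataFrame({
    "package": ["bin", "carton", "bin"],
    "variety": ["variety A", "variety B", "variety A"],
    "origin": ["county X", "county Y", "county X"],
    "temperature": [18.5, 16.0, 12.3],
    "precipitation": [3.1, 0.0, 7.4],
    "avg_price": [160.0, 175.0, 150.0],
})

# One-hot encode the categorical fields; numeric fields pass through unchanged.
categorical = ["package", "variety", "origin"]
encoded = pd.get_dummies(raw, columns=categorical, dtype=float)

# Split into the input matrix and the prediction target.
X = encoded.drop(columns="avg_price")
y = encoded["avg_price"]

print(encoded.columns.tolist())
```

Each original category becomes its own 0/1 column, and within one group (for example the `package_*` columns) exactly one column is 1 in every row.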

Preprocessed dataset (excerpt)

Dataset labels

Scatter plot of the correlation between origin and price

Scatter plot of the correlation between packaging method and price

Scatter plot of the correlation between variety and price

Scatter plot of the correlation between temperature and price

Scatter plot of the correlation between precipitation and price

As the figures show, all of these factors have a strong influence on price, so when training the BP neural network prediction model we include all of them as input features when building the input matrix.

2. Correlation matrix and results:

The diagonal of the heat map is each variable's correlation with itself, which is 1. The figure shows that each factor is strongly correlated with price.

Heat map of the correlation between various factors and prices
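A correlation matrix of this kind can be computed with pandas `DataFrame.corr()`. The sketch below uses synthetic data and hypothetical column names, because the article does not state how its heat map was produced; it only shows the general technique.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration: price depends on temperature, not on precipitation.
rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(20.0, 5.0, n)
precipitation = rng.normal(50.0, 10.0, n)
avg_price = 100.0 - 2.0 * temperature + rng.normal(0.0, 1.0, n)

df = pd.DataFrame({
    "temperature": temperature,
    "precipitation": precipitation,
    "avg_price": avg_price,
})

# Pearson correlation matrix; the diagonal is each column against itself.
corr = df.corr()

# Rank the factors by the strength of their correlation with price.
price_corr = corr["avg_price"].drop("avg_price")
price_corr = price_corr.sort_values(key=lambda s: s.abs(), ascending=False)
print(price_corr)
```

The matrix `corr` is what a heat map visualizes; a plotting library can render it directly.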

3. Missing data and outliers:

We removed the fields in the original dataset that were weakly correlated with price or had a large amount of missing data. We also deleted the rows that contained missing values; because there were few of them, this has little effect on the overall training result. Outliers deviate from normal market prices, so they were removed as well.
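The cleaning steps above might look like the following. The article does not say how outliers were identified, so the 1.5 × IQR rule used here is an assumption, as are the column names and sample values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value and one implausible price.
df = pd.DataFrame({
    "temperature": [18.0, 19.0, np.nan, 20.0, 21.0, 19.5, 20.5],
    "avg_price": [100.0, 105.0, 102.0, 98.0, 101.0, 900.0, 103.0],
})

# Step 1: drop rows that contain any missing value.
clean = df.dropna()

# Step 2: drop price outliers using the 1.5 * IQR rule (an assumed criterion).
q1, q3 = clean["avg_price"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = clean["avg_price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = clean[in_range].reset_index(drop=True)

print(len(df), "rows before,", len(clean), "rows after cleaning")
```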

Part 2: Model building and analysis

1. Model concept: As the competition requires, we use a BP (back-propagation) neural network. A BP neural network is a multilayer feedforward neural network trained with the error back-propagation algorithm, and it is one of the most widely used neural network models. The basic idea of the BP algorithm is gradient descent: it searches along the gradient to minimize the mean squared error between the network's actual output and the expected output. Learning consists of two phases, forward propagation of the signal and back propagation of the error. In forward propagation, the input enters at the input layer, is processed by the hidden layers and reaches the output layer. If the actual output does not match the expected output, training enters the back-propagation phase. There the output error is propagated back through the hidden layers toward the input layer and apportioned among all units of each layer, which yields an error signal for each unit. This error signal is the basis for correcting that unit's weights.
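The two phases can be written out for the simplest case: a network with one hidden layer, a linear output layer, and a single training sample $(x, y)$. The notation (weights $W_1, W_2$, biases $b_1, b_2$, hidden activation $f$, learning rate $\eta$) is assumed here for illustration and does not come from the competition entry.

```latex
\begin{aligned}
\text{Forward pass:}\quad & z = W_1 x + b_1, \qquad h = f(z), \qquad \hat{y} = W_2 h + b_2 \\
\text{Error:}\quad & E = \tfrac{1}{2}\,\lVert \hat{y} - y \rVert^2 \\
\text{Output error signal:}\quad & \delta_2 = \frac{\partial E}{\partial \hat{y}} = \hat{y} - y \\
\text{Hidden error signal:}\quad & \delta_1 = \left(W_2^{\top} \delta_2\right) \odot f'(z) \\
\text{Weight updates:}\quad & W_2 \leftarrow W_2 - \eta\, \delta_2 h^{\top}, \qquad b_2 \leftarrow b_2 - \eta\, \delta_2 \\
& W_1 \leftarrow W_1 - \eta\, \delta_1 x^{\top}, \qquad b_1 \leftarrow b_1 - \eta\, \delta_1
\end{aligned}
```

Here $\odot$ is the element-wise product. The hidden error signal $\delta_1$ is the output error $\delta_2$ passed backward through $W_2$, which is the sense in which the error is apportioned among the units of each layer.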

Such a network has strong nonlinear mapping ability and a flexible structure, can learn and adapt by itself, can generalize what it has learned to new data, and has a degree of fault tolerance. These advantages make BP neural networks well suited to nonlinear regression problems. This task is a nonlinear problem with many influencing factors and a large amount of data, so a BP neural network meets its requirements.

2. Model structure: A BP network adds one or more layers of neurons between the input layer and the output layer. These neurons are called hidden units, and each layer can have several nodes. Hidden units have no direct connection with the outside world, but changes in their state affect the relationship between input and output.

BP neural network structure diagram
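As a rough illustration of this structure, the sketch below trains a network with one hidden layer using NumPy. It is not the authors' implementation: the hidden layer size, the tanh activation, the learning rate and the synthetic training data are all assumptions chosen to keep the example small.

```python
import numpy as np


def train_bp(X, y, hidden=8, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer regression network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W1 = rng.normal(0.0, 0.5, (n_inputs, hidden))  # input layer -> hidden layer
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))         # hidden layer -> output layer
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward propagation of the signal.
        h = np.tanh(X @ W1 + b1)
        y_hat = h @ W2 + b2
        err = y_hat - y
        losses.append(float(np.mean(err ** 2)))    # mean squared error

        # Back propagation of the error.
        grad_out = 2.0 * err / n_samples           # d(MSE)/d(y_hat)
        grad_W2 = h.T @ grad_out
        grad_b2 = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
        grad_W1 = X.T @ grad_pre
        grad_b1 = grad_pre.sum(axis=0)

        # Gradient descent update of every weight and bias.
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return (W1, b1, W2, b2), losses


def predict(params, X):
    """Run the forward pass only."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2


# Synthetic nonlinear regression problem, for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, (200, 2))
y_demo = np.sin(2.0 * X_demo[:, :1]) + 0.5 * X_demo[:, 1:] ** 2

params, losses = train_bp(X_demo, y_demo)
print(f"MSE before training: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

The hidden units appear only as the intermediate array `h`: they are never read or written from outside the network, yet their weights determine the mapping from `X` to the prediction.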

3. Model features:

BP neural networks are mature in both theory and performance. Their main advantages and disadvantages are as follows:

Advantages:

1. Strong nonlinear mapping ability and a flexible network structure. In theory, given enough hidden units, a BP network can approximate any continuous nonlinear function to arbitrary precision, which makes it suitable for problems with complex internal mechanisms.

2. Self-learning and self-adaptive ability. During training, a BP neural network automatically extracts the "hidden rules" between input and output data and adaptively stores what it has learned in the network weights, which improves training accuracy.

3. Generalization ability, that is, the ability to apply what has been learned to new data. When designing a pattern classifier, it is not enough for the network to classify the training objects correctly; we also care whether the trained network correctly classifies unseen patterns or patterns contaminated by noise.

4. A degree of fault tolerance. Damage to some of the neurons in a BP neural network does not greatly affect the overall result, so the network can still work normally and produce results after local damage.

Disadvantages:

1. Local minima. Mathematically, a conventional BP neural network can get stuck in a local extremum, which can cause training to fail.

2. Slow convergence. The BP algorithm is essentially gradient descent, and the objective function it optimizes is very complex, so convergence is slow.

3. No standard way to choose the network structure. So far there is no unified, complete theory to guide the choice of BP network structure, so it is usually chosen by experience.

4. The conflict between prediction ability and training ability. A BP neural network has a limit beyond which fitting the training data more closely makes its prediction ability decline. This is the phenomenon known as overfitting.
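One common way to guard against the overfitting described in the last point is early stopping: monitor the error on a validation set and keep the weights from the epoch where it was lowest. The article does not say whether the authors did this, so the sketch below is only an illustration; the function name, the `patience` parameter and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the best epoch seen before training would stop.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_loss = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Made-up validation curve: the loss falls, then rises as the network overfits.
val_curve = [9.0, 6.0, 4.5, 4.0, 4.2, 4.6, 5.1, 5.5]
stop_at = early_stopping_epoch(val_curve, patience=3)
print("keep the weights from epoch", stop_at)
```

In a real training loop the validation loss would be computed after each epoch and the weights saved whenever it improves, rather than read from a precomputed list.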

Part 3: Data analysis

1. Output: accuracy curves and prediction results:

Test set price prediction curve

Validation set price prediction curve

Training error curve

2. Analysis of the results:

The training and prediction results of the BP neural network model show that, once the number of training iterations reaches 1200, the prediction error falls to about 15.2. This error is still large, but the model's predictions are stable. With further optimization and training, the prediction error can be reduced further to give a better BP neural network prediction model.
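The article does not state which error measure the figure of about 15.2 refers to. For reference, the sketch below computes three common candidates for a regression model (MSE, RMSE and MAE) on made-up prices; the function name and the values are illustrative assumptions.

```python
import numpy as np


def regression_errors(y_true, y_pred):
    """Return common error measures for a set of price predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_pred - y_true
    mse = float(np.mean(residual ** 2))
    return {
        "mse": mse,                              # mean squared error
        "rmse": float(np.sqrt(mse)),             # same units as the price
        "mae": float(np.mean(np.abs(residual))), # mean absolute error
    }


# Made-up actual and predicted prices.
errors = regression_errors(
    [100.0, 120.0, 90.0, 110.0],
    [110.0, 110.0, 100.0, 100.0],
)
print(errors)
```

RMSE and MAE are expressed in the same units as the price, which makes them easier to compare against typical market prices than MSE.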

Part 4: Value and innovation of the work

1. Value of the work: With big data techniques and a BP neural network, the trained model can predict the price of pumpkin over a certain future period and give farmers a reference for buying and selling pumpkins. We also designed the model to be general: it can be trained further or adapted so that it can be used in practice or applied to other prediction problems, so it has a degree of generality. In using a BP neural network to study how pumpkin price relates to the other factors, we also found that the approach has limitations. A BP neural network needs a large amount of underlying data for its predictions to be reasonably accurate, and care must be taken during training to avoid overfitting.

2. Innovations:

(1) Data preprocessing uses one-hot encoding. The method encodes N states with an N-bit status register: each state has its own register bit, and only one bit is active at any time. After one-hot encoding, each categorical feature becomes a group of binary features that are mutually exclusive, with only one active at a time. The data therefore becomes sparse. This solves the problem that the model cannot handle categorical attribute data directly and, to some extent, also expands the feature set.

(2) The model is a BP neural network. Such a network has strong nonlinear mapping ability and a flexible structure, can learn and adapt by itself, can generalize what it has learned to new data, and has a degree of fault tolerance. These advantages make BP neural networks well suited to nonlinear regression problems. A complex BP neural network converges slowly, but thanks to the flexible network structure, when the relationship in the data is close to linear only a few hidden layers and neurons are needed to train well, which saves a lot of training time and improves efficiency. This task is a nonlinear problem with many influencing factors and a large amount of data, so a BP neural network meets its requirements.
