When Transformer Meets Partial Differential Equation Solving
2022-06-27 00:01:00 [Shengsi MindSpore]

This article shares a recently read paper on solving partial differential equations with Transformers, Choose a Transformer: Fourier or Galerkin, which was accepted at NeurIPS 2021.
Background
In our world, from the motion of stars in the universe, to forecasts of temperature and wind speed, to the interactions between molecules and atoms, many processes in engineering, natural science, economics, and business can be described by partial differential equations (PDEs). Traditional methods, such as the finite element, finite difference, and spectral methods, use a discretization to reduce the infinite-dimensional operator mapping to a finite-dimensional approximation problem. In recent years, models such as physics-informed neural networks (PINNs) [1] train a neural network to approximate the PDE solution by sampling in the solution space. However, for both traditional methods and PINNs, even a slight change in the boundary conditions or equation parameters usually requires recomputation or retraining.
By contrast, operator learning aims to learn the mapping between infinite-dimensional function spaces, so that a whole family of PDEs can be solved without retraining, which greatly saves computing resources. Operator learning for PDE solving is an active and rapidly developing research direction; a typical representative is the Fourier neural operator (FNO) [2].
With the release of NeurIPS 2021, the Transformer-based operator learning paper Choose a Transformer: Fourier or Galerkin [4] offers a new approach to solving parametric PDEs and achieves state-of-the-art results.
Main work
In this paper, the operator learner is trained with supervised learning. Training samples are obtained by sampling the input function and the output function on the same discrete grid points. As shown in the figure below, solving the equation can then be cast as a seq2seq problem and modeled with a Transformer [3].
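To make the sampling step concrete, here is a minimal sketch of how one training pair could be built on a shared 1D grid. The function name `make_sample`, the grid size, and the toy input/output pair (a sine and its derivative) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def make_sample(input_fn, output_fn, n_grid=64):
    """Sample an input/output function pair on the same uniform 1D grid."""
    x = np.linspace(0.0, 1.0, n_grid)   # shared grid points
    a = input_fn(x)                     # input function values, shape (n_grid,)
    u = output_fn(x)                    # output function values, shape (n_grid,)
    # Each grid point becomes one "token" of the sequence.
    src = a[:, None]                    # source sequence, shape (n_grid, 1)
    tgt = u[:, None]                    # target sequence, shape (n_grid, 1)
    return x, src, tgt

# Hypothetical pair: the "operator" maps sin(2*pi*x) to its derivative.
x, src, tgt = make_sample(
    lambda x: np.sin(2 * np.pi * x),
    lambda x: 2 * np.pi * np.cos(2 * np.pi * x),
)
```

Because both sequences live on the same grid, the source and target have the same length, which is what lets the problem be treated as sequence-to-sequence.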

Figure 1: Schematic of the operator learner
Building on the Transformer, the main contributions of the paper are as follows:
1. Softmax-free attention. The paper proposes a scale-preserving self-attention mechanism and attention without softmax, and gives mathematical interpretations of both schemes.
2. An operator learner for parametric PDEs. The new attention operator is combined with FNO, which significantly improves accuracy on benchmark problems for parametric PDEs.
3. State-of-the-art experimental results. On three benchmarks, both the accuracy and the performance of the solver are greatly improved.
Pipeline

Figure 2: Network structure of a two-dimensional operator learner
The network structure of the operator learner is shown in the figure above. It mainly consists of the following modules:
1. Feature extractor: one-dimensional problems use a feedforward neural network, two-dimensional problems use a CNN, and so on;
2. Interpolation-based CNN: built by stacking upsampling/downsampling layers and CNN layers;
3. Positional encoding: the Cartesian coordinates of each grid point are concatenated to the input data as additional feature dimensions;
4. Decoder: maps the features learned by the encoder back to the original dimension.
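The positional encoding in item 3 can be sketched as follows. This is a minimal illustration assuming a uniform grid on the unit square and features stored as an (H, W, C) array; the helper names are hypothetical.

```python
import numpy as np

def add_coordinates_2d(features):
    """Concatenate normalized (x, y) grid coordinates to features of shape (H, W, C)."""
    h, w, _ = features.shape
    ys, xs = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    coords = np.stack([xs, ys], axis=-1)                # shape (H, W, 2)
    return np.concatenate([features, coords], axis=-1)  # shape (H, W, C + 2)

def flatten_to_sequence(features):
    """Flatten the grid so every grid point is one token with its feature vector."""
    return features.reshape(-1, features.shape[-1])     # shape (H * W, C + 2)

feats = np.zeros((4, 5, 3))          # hypothetical extracted features
encoded = add_coordinates_2d(feats)
tokens = flatten_to_sequence(encoded)
```

Appending raw coordinates rather than sinusoidal encodings keeps the geometric position of each grid point directly available to the attention layers.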
The loss function used for network training is as follows:

The main part of the loss function is the MSE loss between the network output and the label. In addition, the loss includes a regularization term on the finite differences of the output and the label.
The Fourier-type and Galerkin-type Transformer attention are computed as follows:
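A minimal sketch of a loss with this structure, assuming a 1D uniform grid with spacing h. The weight `gamma` and the use of first-order finite differences are illustrative assumptions, not values from the paper.

```python
import numpy as np

def operator_loss(pred, target, h, gamma=0.1):
    """MSE data term plus a penalty on the mismatch of first-order finite differences."""
    data_term = np.mean((pred - target) ** 2)
    d_pred = np.diff(pred, axis=-1) / h       # discrete derivative of the prediction
    d_target = np.diff(target, axis=-1) / h   # discrete derivative of the label
    reg_term = np.mean((d_pred - d_target) ** 2)
    return data_term + gamma * reg_term

grid = np.linspace(0.0, 1.0, 11)
label = np.sin(grid)
loss_value = operator_loss(label + 0.5, label, h=0.1)
```

A constant offset between prediction and label is penalized only by the data term, while an oscillating error with the same mean square would also be penalized by the difference term.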

Figure 3: Fourier attention

Figure 4: Galerkin attention
Experimental results
1. Burgers' equation
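As a rough single-head sketch of the two attention types, following the formulas in [4] as we understand them: Fourier-type attention computes (LN(Q) LN(K)^T) V / n, and Galerkin-type attention computes Q (LN(K)^T LN(V)) / n, where LN is layer normalization and n is the sequence length. The learnable projections and layer-norm parameters are omitted here, which is a simplification.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token's feature vector (no learnable scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def fourier_attention(q, k, v):
    """Softmax-free attention: (LN(Q) LN(K)^T) V / n, with an n x n intermediate."""
    n = q.shape[0]
    return (layer_norm(q) @ layer_norm(k).T) @ v / n

def galerkin_attention(q, k, v):
    """Softmax-free attention: Q (LN(K)^T LN(V)) / n, with a d x d intermediate."""
    n = q.shape[0]
    return q @ (layer_norm(k).T @ layer_norm(v)) / n

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 4)) for _ in range(3))  # n = 16 tokens, d = 4
out_fourier = fourier_attention(q, k, v)
out_galerkin = galerkin_attention(q, k, v)
```

Without softmax the product is associative, so K^T V can be formed first; this is what allows the Galerkin variant to avoid the n x n matrix.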
The equation is defined as follows :
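For reference, the viscous Burgers' equation in this benchmark is commonly stated in the form below, with periodic boundary conditions. Here ν is the viscosity and u_0 the initial condition; the exact notation is an assumption based on the standard setup in [2] and [4].

```latex
\begin{aligned}
\partial_t u(x,t) + u(x,t)\,\partial_x u(x,t) &= \nu\,\partial_{xx} u(x,t), && x \in (0,1),\ t \in (0,1], \\
u(x,0) &= u_0(x), && x \in (0,1).
\end{aligned}
```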

The task here is to obtain the solution u at time t=1 from the initial condition at t=0. The comparison between the model and FNO is shown in the following table; the model is more accurate than FNO.

2. Darcy flow problem
The equation is defined as follows :
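For reference, the steady-state Darcy flow problem in this benchmark is commonly stated in the form below on the unit square, with a zero Dirichlet boundary condition. Here a is the coefficient field and f the forcing term; the exact notation is an assumption based on the standard setup in [2] and [4].

```latex
\begin{aligned}
-\nabla \cdot \big(a(x)\,\nabla u(x)\big) &= f(x), && x \in (0,1)^2, \\
u(x) &= 0, && x \in \partial (0,1)^2.
\end{aligned}
```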

The problem is defined as the mapping from a two-dimensional random geometric coefficient a to the two-dimensional solution u. The comparison between the model and FNO is shown in the following table; the model is more accurate than FNO.

In addition to accuracy, the performance of the models is also compared. The results are shown below: the Transformer with Galerkin attention has clear advantages in memory usage and speed.
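A back-of-the-envelope sketch of why this advantage is plausible, assuming (as in [4]) that Fourier-type attention forms an n x n intermediate matrix while Galerkin-type attention forms a d x d one, for sequence length n and feature dimension d. The sizes below are hypothetical, and the counts ignore projections and other layers.

```python
def attention_cost(n, d):
    """Return intermediate-matrix entries and multiply-adds for both attention types."""
    fourier = {"intermediate_entries": n * n, "multiply_adds": 2 * n * n * d}
    galerkin = {"intermediate_entries": d * d, "multiply_adds": 2 * n * d * d}
    return fourier, galerkin

fourier, galerkin = attention_cost(n=8192, d=64)  # hypothetical sizes
ratio = fourier["multiply_adds"] // galerkin["multiply_adds"]  # equals n // d
```

Under these assumptions the gap grows linearly with the number of grid points, so it matters most on fine grids.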

Thoughts and summary
The Galerkin Transformer interprets the attention mechanism from a mathematical point of view and, by combining it with operator learning, applies it to solving parametric PDEs, with better accuracy and performance than FNO, the "big brother" of operator learning. In the future, the model can be validated on higher-dimensional and more complex scenarios.
References
[1] Raissi M, Perdikaris P, Karniadakis G E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations[J]. Journal of Computational Physics, 2019, 378: 686-707.
[2] Li Z, Kovachki N, Azizzadenesheli K, et al. Fourier neural operator for parametric partial differential equations[J]. arXiv preprint arXiv:2010.08895, 2020.
[3] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in neural information processing systems. 2017: 5998-6008.
[4] Cao S. Choose a Transformer: Fourier or Galerkin[J]. arXiv preprint arXiv:2105.14995, 2021.

MindSpore official resources
GitHub: https://github.com/mindspore-ai/mindspore
Gitee: https://gitee.com/mindspore/mindspore
Official QQ group: 486831414