Literature reading: GoPose, 3D human pose estimation using WiFi
2022-07-24 19:12:00 【Gone forever communication er】
Motivation: why does the author want to solve this problem?
- Previous WiFi-based 3D human pose estimation has the following shortcomings:
  - It only works for poses at a fixed position [1]
  - It only supports predefined activities [2]
Contribution: what has the author done in this paper (innovation points)?
Challenges
- Unlike USRP or FMCW radar, the channel state information (CSI) exported by off-the-shelf WiFi devices does not provide any spatial information about the human body (how should "spatial information" be understood? AoA, AoD, and so on)
- How can the human pose estimation system be made independent of its operating environment?
- How can the complex relationship between the 2D AoA spectrum and the 3D human skeleton be modeled?
Solutions
- Derive the 2D AoA spectrum from nonlinearly spaced antennas and combine it with the spatial diversity of the transmitter and the frequency diversity of the WiFi OFDM subcarriers, improving the spatial resolution of the 2D AoA spectrum so that signals reflected from different parts of the body can be distinguished
- Subtract the 2D AoA spectrum of the static environment from the spectrum extracted while one or more users perform activities
- Take the 2D AoA spectrum as input and infer the 3D human pose with a CNN and an LSTM. The CNN extracts spatial features; the LSTM extracts temporal features
Accuracy
- GoPose achieves an accuracy of about 4.5 cm in a variety of scenarios, including tracking unseen activities and NLoS scenarios (is this accuracy the MPJPE? It should be)
Planning: how do they get the job done?
Overall architecture
WiFi Probing: collect data and denoise it with linear fitting
Data Processing: first, spatial diversity and frequency diversity (described in detail later) are combined to improve the resolution of the 2D AoA spectrum, so that signals reflected from different parts of the body can be distinguished; then, static signals reflected from the indoor environment are filtered out by static environment removal; finally, the 2D AoA spectra of multiple packets are combined as the input of the network
3D Pose Construction: the CNN captures the spatial features of body parts, and the LSTM estimates the temporal features of motion
Improving the 2D AoA resolution: spatial diversity and frequency diversity
1D AoA estimation is not elaborated much in the paper; it uses the MUSIC algorithm
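As a reminder of what MUSIC does, here is a minimal NumPy sketch for a uniform linear array. It is not the GoPose implementation: the function names `steering_matrix` and `music_spectrum`, the half-wavelength antenna spacing, and the angle convention (angle measured from the array axis) are assumptions made for illustration.

```python
import numpy as np


def steering_matrix(n_antennas, angles_deg, spacing=0.5):
    """Steering vectors of a uniform linear array, one column per angle.

    spacing is the antenna spacing in wavelengths (0.5 is an assumption).
    """
    k = np.arange(n_antennas)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(-2j * np.pi * spacing * k * np.cos(theta))


def music_spectrum(X, n_sources, angles_deg, spacing=0.5):
    """MUSIC pseudo-spectrum for samples X of shape (n_antennas, n_snapshots)."""
    n_antennas, n_snapshots = X.shape
    R = X @ X.conj().T / n_snapshots              # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)                # eigenvalues in ascending order
    En = eigvecs[:, : n_antennas - n_sources]     # noise subspace
    A = steering_matrix(n_antennas, angles_deg, spacing)
    proj = En.conj().T @ A                        # projection onto noise subspace
    return 1.0 / np.sum(np.abs(proj) ** 2, axis=0)
```

Peaks of the returned pseudo-spectrum mark candidate arrival angles. The 2D case in the paper replaces the steering vector with one that depends on both azimuth and elevation, but the noise-subspace projection is the same idea.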
2D AoA estimation:
An L-shaped antenna array is used to derive the azimuth $\varphi$ and elevation $\theta$ of the incident signal; see Section 3.3 of the paper for the detailed formulas
Although the 2D AoA spectrum gives the approximate location of the human body in 2D space, it cannot distinguish signals reflected from different parts of the body, for example from the torso (signal $k_2$) or from the legs (signal $k_3$). This is because the hardware limitations of commodity WiFi make the resolution of the 2D AoA spectrum very low. To overcome this limitation, the authors further combine the spatial diversity of the transmitter (2D AoA, AoD) with the frequency diversity of the WiFi OFDM subcarriers (ToF) to improve the resolution of the 2D AoA spectrum.
The spatial diversity across the three transmit antennas introduces a phase shift that depends on the angle of departure (AoD), and the frequency diversity of the OFDM subcarriers introduces a phase shift that depends on the relative time of flight (ToF). Spatial and frequency diversity can therefore be used to jointly estimate the 2D AoA, AoD and ToF, which significantly improves the resolution of the 2D AoA spectrum:
$$
\begin{aligned}
\mathbf{a}^{\prime}(\varphi, \theta, \tau) &= \left[1, \ldots, \Omega_{\tau}^{V-1}, \Phi_{(\varphi, \theta)}, \ldots, \Omega_{\tau}^{V-1} \Phi_{(\varphi, \theta)}, \ldots, \Phi_{(\varphi, \theta)}^{R-1}, \ldots, \Omega_{\tau}^{V-1} \Phi_{(\varphi, \theta)}^{R-1}\right]^{T} \\
\mathbf{a}(\varphi, \theta, \omega, \tau) &= \left[\mathbf{a}^{\prime}(\varphi, \theta, \tau), \Gamma_{\omega} \mathbf{a}^{\prime}(\varphi, \theta, \tau), \ldots, \Gamma_{\omega}^{S-1} \mathbf{a}^{\prime}(\varphi, \theta, \tau)\right]^{T}
\end{aligned}
$$
$$
P(\varphi, \theta, \omega, \tau)_{\text{Improve}} = \frac{1}{\mathbf{a}^{H}(\varphi, \theta, \omega, \tau) \mathbf{E}_{N} \mathbf{E}_{N}^{H} \mathbf{a}(\varphi, \theta, \omega, \tau)}
$$
where $\varphi$ is the azimuth, $\theta$ the elevation, $\omega$ the AoD, and $\tau$ the ToF.
Static environment removal
Because the 2D AoA spectrum provides spatial information about the multipath signals, this information can be used to remove the LoS signal and the signals reflected from the static environment, which enables environment-independent 3D pose estimation. Concretely, the 2D AoA spectrum of the static environment is subtracted from the 2D AoA spectrum measured during human activity.
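The subtraction can be sketched in a few lines of NumPy. The function name `remove_static_environment` is hypothetical, and clipping negative values to zero is an assumption added here (the notes only say "subtract"); set `clip_negative=False` to get the plain difference.

```python
import numpy as np


def remove_static_environment(activity_spectrum, static_spectrum, clip_negative=True):
    """Subtract the static-environment 2D AoA spectrum from the activity spectrum.

    Both inputs are arrays of the same shape, e.g. (n_azimuth, n_elevation).
    """
    activity = np.asarray(activity_spectrum, dtype=float)
    static = np.asarray(static_spectrum, dtype=float)
    if activity.shape != static.shape:
        raise ValueError("spectra must have the same shape")
    diff = activity - static
    # Assumption: energy below the static baseline is treated as zero.
    return np.clip(diff, 0.0, None) if clip_negative else diff
```

In practice the static spectrum would be measured once in the empty room and reused for every packet recorded in that room.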

Combining multiple packets:
The 2D AoA spectrum derived from a single WiFi packet captures only a small part of the body's motion, so a sequence of packets (100 packets) is used as the input of the neural network to estimate the human pose.

Neural network
The azimuth and elevation ranges are both set to [0, 180] degrees with a resolution of 1 degree, giving a spectrum of size 180 × 180. The system uses 4 receivers to capture the user's motion from different angles; concatenating the spectra of the four receivers gives a 180 × 180 × 4 tensor. Multiple spectra also need to be combined to capture whole-body motion, so the 100 packets of each receiver are concatenated to form a 180 × 180 × 400 tensor as input.
In the network, the CNN captures the spatial features of body parts, and the LSTM estimates the temporal features of motion.
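To make the input shape concrete, the following sketch stacks per-receiver, per-packet spectra into the 180 × 180 × 400 tensor described above. The function name `build_network_input` and the channel ordering (all packets of receiver 0 first, then receiver 1, and so on) are assumptions; the notes do not say how the channels are ordered.

```python
import numpy as np


def build_network_input(spectra):
    """Stack 2D AoA spectra into a channels-last input tensor.

    spectra has shape (n_receivers, n_packets, n_azimuth, n_elevation);
    the result has shape (n_azimuth, n_elevation, n_receivers * n_packets).
    """
    spectra = np.asarray(spectra)
    n_rx, n_pkt, n_az, n_el = spectra.shape
    stacked = spectra.reshape(n_rx * n_pkt, n_az, n_el)  # receiver-major channels
    return stacked.transpose(1, 2, 0)                    # move channels to the last axis
```

With 4 receivers, 100 packets and 180 × 180 spectra, the output shape is (180, 180, 400), and channel `r * 100 + p` holds packet `p` of receiver `r`.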
Loss function:
$$
L_{P}=\frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N}\left\|\bar{p}_{t}^{i}-p_{t}^{i}\right\|_{2},
$$
$$
L_{H}=\frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N}\left\|\bar{p}_{t}^{i}-p_{t}^{i}\right\|_{H},
$$
$$
L=Q_{P} \cdot L_{P}+Q_{H} \cdot L_{H},
$$
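The combined loss can be written out numerically as follows. This is a sketch under stated assumptions: the notes do not define the $\|\cdot\|_H$ term, so it is taken here to be the standard Huber function applied to each joint's Euclidean error, with a threshold `delta` of 1.0; the function names, the default weights `q_p` and `q_h`, and the (T, N, 3) array layout are also illustrative choices.

```python
import numpy as np


def huber(r, delta=1.0):
    """Standard Huber function: quadratic near zero, linear beyond delta."""
    r = np.abs(r)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))


def pose_loss(pred, target, q_p=1.0, q_h=1.0, delta=1.0):
    """Return (L, L_P, L_H) for joint positions of shape (T, N, 3)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    dist = np.linalg.norm(pred - target, axis=-1)  # per-frame, per-joint error, shape (T, N)
    l_p = dist.mean()                              # average over T frames and N joints
    l_h = huber(dist, delta).mean()                # assumed form of the Huber term
    return q_p * l_p + q_h * l_h, l_p, l_h
```

Because every frame has the same number of joints, the single `mean()` over the (T, N) array equals the nested averages in the formulas.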
Reasoning: what experiments are used to verify the results?
Experimental configuration
One transmitter and four receivers; the transmitter has 3 antennas and each receiver has 3 antennas (arranged in an L shape)
Packet transmission rate: 1000 Hz
Kinect 2.0 records the ground truth (can it record absolute pose?)
Data from 10 people
Experimental sites
A living room (4 m × 4 m), a dining room (3.6 m × 3.6 m) and a bedroom (4 m × 3.8 m)
Default transmitter-receiver distance: 2.5 m
Evaluation metric
The joint localization error is used as the evaluation metric, defined as the Euclidean distance between the predicted joint position and the ground truth. Note that 14 keypoints/joints are evaluated (are they aligned or not?)
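The alignment question matters for the reported number, as this small sketch shows. It computes the mean per-joint position error with and without root alignment; the function name `mpjpe`, the `root_index` parameter, and the (T, 14, 3) layout are assumptions, and the notes do not say which variant the paper reports.

```python
import numpy as np


def mpjpe(pred, truth, root_index=None):
    """Mean per-joint position error for arrays of shape (T, n_joints, 3).

    If root_index is given, both skeletons are translated so that this joint
    sits at the origin before the error is computed (root-relative error).
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if root_index is not None:
        pred = pred - pred[:, root_index : root_index + 1, :]
        truth = truth - truth[:, root_index : root_index + 1, :]
    return float(np.linalg.norm(pred - truth, axis=-1).mean())
```

A prediction that has the right shape but is shifted as a whole gets a nonzero absolute error and a zero root-aligned error, so the two variants are not comparable.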
Overall performance
① NLoS conditions: shows that a deep learning model trained under LoS conditions can be applied to NLoS scenarios without retraining
② Effect of environment changes: the system is trained in one environment (such as the living room or dining room) and then evaluated at run time in a different environment (for example, the bedroom)
③ Effect of the distance between transceivers
④ Effect of the packet rate
⑤ Different users: 7 people for training, 1 for validation, 2 for testing
⑥ Multi-user effect: a proof-of-concept experiment collected data from 2 people, but it is not very convincing
My own opinion
- It needs 4 receivers, which is too many
- Is this absolute pose estimation? It is probably relative to the root node
References
[1] Towards 3D Human Pose Construction Using WiFi
[2] Winect: 3D Human Pose Tracking for Free-form Activity Using Commodity WiFi