
Ultra-fast Transformers | Redesigning ViT with the Res2Net Idea and Dynamic Kernel Sizes, Surpassing MobileViT

2022-06-22 21:00:00 Zhiyuan community

In the pursuit of ever-higher accuracy, network models are usually scaled up. Such models demand substantial computing resources and therefore cannot be deployed on edge devices. Since edge devices are used in many application domains, building resource-efficient general-purpose networks is of great value.

This work combines the strengths of CNN and Transformer models and proposes a new efficient hybrid architecture, EdgeNeXt. In particular, EdgeNeXt introduces a Split Depth-wise Transposed Attention (SDTA) encoder, which splits the input tensor into multiple channel groups and uses depth-wise convolutions together with self-attention across the channel dimension to implicitly enlarge the receptive field and encode multi-scale features.
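
To make the mechanism concrete, below is a minimal PyTorch sketch of the SDTA idea as described above: Res2Net-style channel-group splitting with depth-wise convolutions, followed by self-attention computed across the channel dimension (transposed attention). The class name `SDTASketch`, the group count, head count, and layer details are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the SDTA idea (assumed structure, not the paper's exact code):
# 1) split channels into groups and process them hierarchically with depth-wise convs,
# 2) apply self-attention across the channel dimension (C x C attention map).
import torch
import torch.nn as nn


class SDTASketch(nn.Module):
    def __init__(self, dim: int, groups: int = 4, num_heads: int = 4):
        super().__init__()
        self.groups = groups
        chunk = dim // groups
        # One depth-wise 3x3 conv per channel group (the first group is passed through).
        self.dw_convs = nn.ModuleList(
            [nn.Conv2d(chunk, chunk, kernel_size=3, padding=1, groups=chunk)
             for _ in range(groups - 1)]
        )
        self.num_heads = num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Res2Net-style hierarchical processing of channel groups widens the
        # effective receptive field without using large kernels.
        chunks = list(torch.chunk(x, self.groups, dim=1))
        out = [chunks[0]]
        prev = chunks[0]
        for conv, chunk in zip(self.dw_convs, chunks[1:]):
            prev = conv(chunk + prev)
            out.append(prev)
        x = torch.cat(out, dim=1)

        # Transposed self-attention: the attention map is C/heads x C/heads,
        # so the cost grows linearly with the number of spatial tokens.
        tokens = x.flatten(2).transpose(1, 2)                 # (B, HW, C)
        q, k, v = self.qkv(tokens).chunk(3, dim=-1)

        def heads(t):
            return t.reshape(b, -1, self.num_heads, c // self.num_heads).permute(0, 2, 3, 1)

        q, k, v = heads(q), heads(k), heads(v)                # (B, heads, C/heads, HW)
        q = nn.functional.normalize(q, dim=-1)
        k = nn.functional.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature   # (B, heads, C/heads, C/heads)
        out = attn.softmax(dim=-1) @ v                        # (B, heads, C/heads, HW)
        out = out.permute(0, 3, 1, 2).reshape(b, h * w, c)
        out = self.proj(out)
        return out.transpose(1, 2).reshape(b, c, h, w)


# Usage example: the block keeps the input resolution and channel count.
x = torch.randn(1, 64, 56, 56)
print(SDTASketch(dim=64)(x).shape)  # torch.Size([1, 64, 56, 56])
```

Because attention here operates over channels rather than spatial tokens, its cost does not grow quadratically with image resolution, which is what makes this design attractive for edge deployment.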

Extensive experiments on classification, detection, and segmentation tasks demonstrate the advantages of the proposed method: EdgeNeXt outperforms state-of-the-art methods at comparatively low computational cost. The EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K, surpassing MobileViT by an absolute 2.2% with 28% fewer FLOPs. In addition, the EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.

Paper link:

https://arxiv.org/abs/2206.10589


Copyright notice
This article was created by [Zhiyuan community]. Please include a link to the original when reproducing it.
https://yzsam.com/2022/173/202206221930181050.html