Paper reading (59): Keyword-based diverse image retrieval with variational multiple instance graph
2022-06-28 11:01:00 【Inge】
1 Overview
1.1 Title
1.2 Background
Cross-modal image retrieval has recently attracted extensive research attention. In real-world applications, the keyword-based queries that users issue are usually short and semantically broad. In such user-oriented services, semantic diversity is therefore as important as retrieval accuracy for a good user experience. However, most cross-modal image retrieval methods rely on a single-point query embedding and yield low semantic diversity, whereas diversified retrieval methods tend to have low accuracy because they lack cross-modal understanding.
1.3 Strategy
The paper proposes an end-to-end variational multiple instance graph (VMIG), which:
1) learns a continuous semantic space to capture the different semantics of a query;
2) formulates the retrieval task as a multiple instance learning problem that connects diverse features across modalities.
Specifically, a query-guided variational autoencoder (VAE) models the continuous semantic space instead of learning a single-point embedding. Multiple instances of the image and of the query are then obtained by sampling in the continuous semantic space and by applying multi-head attention, respectively. Next, an instance graph is built to remove noisy instances and align cross-modal semantics. Finally, the heterogeneous modalities are fused robustly under multiple losses.
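To make the idea of "sampling in a continuous semantic space" concrete, here is a minimal sketch of the standard VAE reparameterization trick, which draws several instances from one learned Gaussian instead of returning a single point. The function name `sample_instances`, the latent size, and the zero-valued `mu` and `log_var` are illustrative assumptions; in a trained model they would come from the encoder, and the paper's exact parameterization may differ.

```python
import numpy as np

def sample_instances(mu, log_var, num_instances, rng):
    """Draw several latent instances z = mu + sigma * eps (reparameterization).

    mu, log_var: 1-D arrays of the same length; they would normally be
    produced by a VAE encoder (hypothetical placeholders here).
    """
    sigma = np.exp(0.5 * log_var)                      # std from log-variance
    eps = rng.standard_normal((num_instances, mu.shape[0]))
    return mu + sigma * eps                            # shape: (num_instances, dim)

rng = np.random.default_rng(0)
mu = np.zeros(4)        # assumed latent mean
log_var = np.zeros(4)   # assumed latent log-variance (sigma = 1)
instances = sample_instances(mu, log_var, num_instances=5, rng=rng)
print(instances.shape)  # (5, 4)
```

Each row is one candidate instance; a larger variance spreads the instances further apart in the semantic space, which is what allows a single short query to cover several meanings.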
1.4 Bib
@article{Zeng:2022:110,
author = {Zeng, Yawen and Wang, Yiru and Liao, Dongliang and Li, Gongfu and Huang, Weijie and Xu, Jin and Cao, Da and Man, Hong},
title = {Keyword-based diverse image retrieval with variational multiple instance graph},
journal = {{IEEE} Transactions on Neural Networks and Learning Systems},
pages = {1--10},
year = {2022},
doi = {10.1109/TNNLS.2022.3168431},
url = {https://ieeexplore.ieee.org/abstract/document/9764824}
}
2 Framework
Figure 2 shows the overall framework of VMIG, which consists of three parts:
1) Semantic feature projection: extract the features of the image and the query and project them into their respective semantic spaces;
2) Cross-modal diversity generator: learn a one-to-many semantic distribution to generate multiple instances, and build a cross-modal multiple instance graph. Multiple instances of the image and the query are obtained with the query-guided VAE and multi-head attention, respectively; the cross-modal multiple instance graph is used to explore intra-modal semantic relevance and cross-modal alignment;
3) Semantic space constraints: multiple losses constrain the cross-modal semantic space.
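The attention step in part 2 can be pictured as attention pooling with several heads, where each head attends to a set of input feature vectors in its own way and emits one instance. The sketch below is a simplified illustration, not the paper's formulation: the function `attention_instances`, the per-head query vectors `head_queries`, and all sizes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_instances(tokens, head_queries):
    """Pool a set of feature vectors into one instance per head.

    tokens:       (R, d) input feature vectors (hypothetical)
    head_queries: (K, d) one query vector per head (hypothetical; learned
                  in a real model, random in this sketch)
    returns:      (K, d) array, one attention-weighted instance per head
    """
    d = tokens.shape[1]
    scores = head_queries @ tokens.T / np.sqrt(d)  # (K, R) scaled dot products
    weights = softmax(scores, axis=1)              # each row sums to 1
    return weights @ tokens                        # (K, d)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((49, 8))        # assumed: 49 feature vectors of size 8
head_queries = rng.standard_normal((3, 8))   # assumed: 3 heads
instances = attention_instances(tokens, head_queries)
print(instances.shape)  # (3, 8)
```

Because each head weights the inputs differently, the K outputs can emphasize different aspects of the same input, which is the property a multiple instance formulation relies on.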

2.1 Semantic feature projection
Let $v$ and $t$ denote an image and a keyword-based query, respectively. Given a query $t$, the goal is to retrieve appropriate images while ensuring both relevance and diversity. To learn better features, image features $\mathbf{f}_v$ are first extracted with ResNet, and query features $\mathbf{f}_t$ are obtained with Doc2Vec. These features are then projected separately into the semantic space:
$$
\left\{ \begin{array}{lcl} \tilde{\mathbf{f}}_v&=&o_v(\mathbf{f}_v)\\ \tilde{\mathbf{f}}_t&=&o_t(\mathbf{f}_t) \end{array} \right. \tag{1}
$$
where $o_v$ and $o_t$ are projection functions approximated by fully connected networks.
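As a rough illustration of Eq. (1), the sketch below implements each projection function as a single fully connected layer. The class name `Projection`, the ReLU activation, the single-layer depth, and the feature sizes (2048 for the image feature, 300 for the query feature, 256 for the output) are all assumptions made for the example; the source only states that $o_v$ and $o_t$ are fully connected networks.

```python
import numpy as np

class Projection:
    """One fully connected layer with ReLU, standing in for o_v or o_t."""

    def __init__(self, in_dim, out_dim, rng):
        # He-style random initialization (an assumption; untrained weights).
        self.W = rng.standard_normal((in_dim, out_dim)) * np.sqrt(2.0 / in_dim)
        self.b = np.zeros(out_dim)

    def __call__(self, f):
        return np.maximum(f @ self.W + self.b, 0.0)  # ReLU activation (assumed)

rng = np.random.default_rng(0)
o_v = Projection(2048, 256, rng)   # image projection, sizes assumed
o_t = Projection(300, 256, rng)    # query projection, sizes assumed

f_v = rng.standard_normal(2048)    # stand-in for a ResNet image feature
f_t = rng.standard_normal(300)     # stand-in for a Doc2Vec query feature

f_v_tilde = o_v(f_v)               # projected image feature
f_t_tilde = o_t(f_t)               # projected query feature
print(f_v_tilde.shape, f_t_tilde.shape)  # (256,) (256,)
```

Giving the two projections the same output size is a convenience for later cross-modal comparison in this sketch; it is not something the text above specifies.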
2.2 Cross-modal diversity generator