Paper reading (59): Keyword-based diverse image retrieval with variational multiple instance graph
2022-06-28 11:01:00 【Inge】
1 Summary
1.1 Subject
1.2 Background
Cross-modal image retrieval has recently attracted extensive research attention. In real-world use, the keyword-based queries that users issue are usually very short and semantically broad. In such user-oriented services, semantic diversity is therefore as important as retrieval accuracy for a good user experience. However, most cross-modal image retrieval methods based on single-point query embedding yield low semantic diversity, while diversified retrieval methods suffer from low accuracy because they lack cross-modal understanding.
1.3 Strategy
The paper proposes an end-to-end variational multiple instance graph (VMIG), which:
1) learns a continuous semantic space to capture diverse query semantics;
2) formulates the retrieval task as a multiple instance learning problem that connects diverse features across modalities.
Specifically, a query-guided variational autoencoder (VAE) models the continuous semantic space instead of learning a single-point embedding. Multiple instances of the image and the query are then obtained by sampling in the continuous semantic space and by applying multi-head attention, respectively. Next, an instance graph is built to remove noisy instances and align cross-modal semantics. Finally, the heterogeneous modalities are robustly fused under multiple losses.
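The sampling step can be illustrated with the standard reparameterization trick for a diagonal Gaussian. This is only a minimal sketch: the function name `sample_instances`, the three-dimensional latent space, and the numeric values of `mu` and `log_var` are assumptions made for illustration, and the paper's actual encoder is not described in these notes.

```python
import math
import random

def sample_instances(mu, log_var, k, seed=0):
    """Draw k instances z = mu + sigma * eps from a diagonal Gaussian."""
    rng = random.Random(seed)
    # sigma = exp(0.5 * log(sigma^2))
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(k)
    ]

# Hypothetical encoder output (mean and log-variance); values are made up.
mu = [0.2, -1.0, 0.5]
log_var = [-2.0, -2.0, -2.0]

# Each draw is one point in the continuous semantic space.
instances = sample_instances(mu, log_var, k=4)
for z in instances:
    print([round(x, 3) for x in z])
```

Because every draw comes from the same learned distribution, the instances stay near the mean while still differing from one another, which is the property a single-point embedding cannot offer.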
1.4 Bib
@article{Zeng:2022:110,
author = {Zeng, Yawen and Wang, Yiru and Liao, Dongliang and Li, Gongfu and Huang, Weijie and Xu, Jin and Cao, Da and Man, Hong},
title = {Keyword-based diverse image retrieval with variational multiple instance graph},
journal = {{IEEE} Transactions on Neural Networks and Learning Systems},
pages = {1--10},
year = {2022},
doi = {10.1109/TNNLS.2022.3168431},
url = {https://ieeexplore.ieee.org/abstract/document/9764824}
}
2 Framework
Figure 2 shows the overall framework of VMIG, which consists of three parts:
1) Semantic feature projection: extract the features of the image and the query, and project them into their respective semantic spaces;
2) Cross-modal diversity generator: learn a one-to-many semantic distribution to generate multiple instances, and build a cross-modal multiple instance graph. Multiple instances of the image and the query are obtained with the query-guided VAE and multi-head attention, respectively; the cross-modal multiple instance graph is then used to explore intra-modal semantic relevance and cross-modal alignment;
3) Semantic space constraints: multiple losses constrain the cross-modal semantic space.
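The attention step in part 2) can be illustrated with a generic attention pooling in which each head yields one instance. This is a simplified sketch, not the paper's implementation: each head is reduced to a single query vector (standard multi-head attention also has per-head key and value projections), and the names `attention_instances`, `features`, and `head_queries` as well as all sizes are assumptions.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_instances(features, head_queries):
    """Return one attention-pooled instance per head."""
    dim = len(features[0])
    instances = []
    for q in head_queries:
        # Scaled dot-product score of every feature vector against this head.
        scores = [sum(f_i * q_i for f_i, q_i in zip(f, q)) / math.sqrt(dim)
                  for f in features]
        weights = softmax(scores)
        # Weighted sum of the feature vectors gives this head's instance.
        pooled = [sum(w * f[j] for w, f in zip(weights, features))
                  for j in range(dim)]
        instances.append(pooled)
    return instances

rng = random.Random(0)
# Hypothetical input: 5 feature vectors of dimension 4, and 3 heads.
features = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
head_queries = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]

pooled_instances = attention_instances(features, head_queries)
print(len(pooled_instances), "instances of dimension", len(pooled_instances[0]))
```

Different heads weight the same input differently, so the pooled vectors emphasize different aspects of it; that is the sense in which one input becomes multiple instances.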
2.1 Semantic feature projection
Let $v$ and $t$ denote an image and a keyword-based query, respectively. Given $t$, the goal is to retrieve appropriate images while ensuring both relevance and diversity. To learn better features, ResNet is first used to extract the image feature $\mathbf{f}_v$, and Doc2Vec is used to obtain the query feature $\mathbf{f}_t$. These features are then projected separately into the semantic space:
$$
\left\{
\begin{array}{lcl}
\tilde{\mathbf{f}}_v &=& o_v(\mathbf{f}_v)\\
\tilde{\mathbf{f}}_t &=& o_t(\mathbf{f}_t)
\end{array}
\right.
\tag{1}
$$
where $o_v$ and $o_t$ are projection functions, each approximated by a fully connected network.
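Equation (1) can be sketched with two small fully connected projections. Everything beyond the equation itself is an assumption: the helper `make_fc`, the single-layer depth, the tanh activation, the random untrained weights, and the toy dimensions (real ResNet and Doc2Vec features are far larger) are chosen only to keep the example short.

```python
import math
import random

def make_fc(in_dim, out_dim, rng):
    """Build a one-layer fully connected projection with random weights."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
         for _ in range(out_dim)]
    b = [0.0] * out_dim

    def project(f):
        # y_j = tanh(sum_i w[j][i] * f[i] + b[j])
        return [math.tanh(sum(w_i * x_i for w_i, x_i in zip(row, f)) + b_j)
                for row, b_j in zip(w, b)]

    return project

rng = random.Random(0)
o_v = make_fc(8, 4, rng)  # image branch, stands in for o_v in Eq. (1)
o_t = make_fc(6, 4, rng)  # query branch, stands in for o_t in Eq. (1)

# Stand-ins for the ResNet image feature f_v and the Doc2Vec query feature f_t.
f_v = [rng.random() for _ in range(8)]
f_t = [rng.random() for _ in range(6)]

f_v_tilde = o_v(f_v)
f_t_tilde = o_t(f_t)
print(len(f_v_tilde), len(f_t_tilde))
```

Each modality keeps its own projection, so the two feature types, which start with different dimensions, can each be mapped to a space of a chosen size.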
2.2 Cross-modal diversity generator