Fusion vs. Two-Stage for Multimodal Retrieval Avi Arampatzis, Konstantinos Zagoris, and Savvas A. Chatzichristofis
 

Summary: Fusion vs. Two-Stage for Multimodal Retrieval
Avi Arampatzis, Konstantinos Zagoris, and Savvas A. Chatzichristofis
Department of Electrical and Computer Engineering,
Democritus University of Thrace, Xanthi 67100, Greece
{avi,kzagoris,schatzic}@ee.duth.gr
Abstract. We compare two methods for retrieval from multimodal collections.
The first is a score-based fusion of results, retrieved visually and textually. The
second is a two-stage method that visually re-ranks the top-K results textually
retrieved. We discuss their underlying hypotheses and practical limitations, and
conduct a comparative evaluation on a standardized snapshot of Wikipedia. Both
methods are found to be significantly more effective than single-modality
baselines, with no clear winner but with different robustness features. Nevertheless,
two-stage retrieval provides efficiency benefits over fusion.
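As a rough illustration of the two methods contrasted in the abstract, the following Python sketch ranks documents from per-modality scores. It is not the authors' implementation: the min-max normalization, the linear weight `w`, the fixed cutoff `k`, and the choice to append the remaining text-ranked results after the re-ranked top-K are all assumptions made for the example, as are the function names and the toy scores.

```python
from typing import Dict, List

Scores = Dict[str, float]  # document id -> retrieval score (hypothetical format)


def min_max(scores: Scores) -> Scores:
    """Rescale scores to [0, 1]; an assumed normalization, not taken from the paper."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(text: Scores, visual: Scores, w: float = 0.5) -> List[str]:
    """Score-based fusion: weighted sum of normalized text and visual scores."""
    t, v = min_max(text), min_max(visual)
    docs = set(t) | set(v)
    combined = {d: w * t.get(d, 0.0) + (1.0 - w) * v.get(d, 0.0) for d in docs}
    # Sort by descending combined score; ties are broken by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


def two_stage(text: Scores, visual: Scores, k: int) -> List[str]:
    """Two-stage: retrieve by text, then re-rank only the top-k by visual score."""
    by_text = sorted(text, key=lambda d: (-text[d], d))
    top, rest = by_text[:k], by_text[k:]
    reranked = sorted(top, key=lambda d: (-visual.get(d, 0.0), d))
    # Assumption: results below the cutoff keep their text-based order.
    return reranked + rest


# Toy scores, invented for the example.
text_scores = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.5}
visual_scores = {"a": 0.1, "b": 0.9, "c": 0.8, "e": 1.0}

print(fuse(text_scores, visual_scores))
print(two_stage(text_scores, visual_scores, k=2))
```

Note that in this sketch the two-stage function needs visual scores only for the `k` documents that survive the text stage, whereas fusion needs a visual score for every candidate, which is one way to read the efficiency benefit mentioned in the abstract.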
1 Introduction
Nowadays, information collections are not only large, but they may also be multimodal.
Take as an example Wikipedia, where a single topic may be covered in several
languages and include non-textual media such as image, sound, and video. Moreover,
non-textual media may in turn be annotated.
We focus on two modalities, text and image. On the one hand, textual descriptions
are key to retrieving relevant results for a topic, but at the same time provide little

  

Source: Arampatzis, Avi - Department of Electrical and Computer Engineering, Democritus University of Thrace

 

Collections: Computer Technologies and Information Sciences