          ID: 520387.0, MPI für Informatik / Databases and Information Systems Group

Overview of the INEX 2009 Ad Hoc Track
Authors:Geva, Shlomo; Kamps, Jaap; Lehtonen, Miro; Schenkel, Ralf; Thom, James A.; Trotman, Andrew
Publisher:IR Publishers
Place of Publication:Amsterdam, The Netherlands
Date of Publication (YYYY-MM-DD):2009
Title of Proceedings:Preproceedings of the 2009 INEX Workshop
Start Page:16
End Page:50
Place of Conference/Meeting:Woodlands of Marburg, Australia
(Start) Date of Conference/Meeting
End Date of Conference/Meeting 
Audience:Experts Only
Intended Educational Use:No
Abstract / Description:This paper gives an overview of the INEX 2009 Ad Hoc Track. The main goals of
the Ad Hoc Track were three-fold. The first goal was to investigate the impact
of the collection scale and markup, by using a new collection that is again
based on the Wikipedia but is over 4 times larger, with longer articles and
additional semantic annotations. For this reason the Ad Hoc track tasks stayed
unchanged, and the Thorough Task of INEX 2002–2006 returned. The second goal was
to study the impact of more verbose queries on retrieval effectiveness, by
using the available markup as structural constraints (now using both the
Wikipedia’s layout-based markup and the enriched semantic markup) and by the
use of phrases. The third goal was to compare different
result granularities by allowing systems to retrieve XML elements, ranges of
XML elements, or arbitrary passages of text. This investigates the value of the
internal document structure (as provided by the XML mark-up) for retrieving
relevant information. The INEX 2009 Ad Hoc Track featured four tasks. For the
Thorough Task a ranked list of results (elements or passages) by estimated
relevance was needed. For the Focused Task a ranked list of non-overlapping
results (elements or passages) was needed. For the Relevant in Context Task
non-overlapping results (elements or passages) were returned grouped by the
article from which they came. For the Best in Context Task a single starting
point (element start tag or passage start) for each article was needed. We
discuss the setup of the track, the results for the four tasks, and examine the
relative effectiveness of element and passage retrieval. This is examined
in the context of content-only (CO, or keyword) search as well as
content-and-structure (CAS, or structured) search. In addition, we look at the
effectiveness of systems using a reference run with a solid article ranking,
and of systems using the phrase query. Finally, we look at the ability of
focused retrieval techniques to rank articles.
Last Change of the Resource (YYYY-MM-DD):2009-12-23
External Publication Status:published
Document Type:Conference-Paper
Communicated by:Gerhard Weikum
Affiliations:MPI für Informatik/Databases and Information Systems Group
