TODO: Travis
Timo Breuer and Philipp Schaer
This is the Docker image for our replicated submission to CENTRE@CLEF2019, conforming to the OSIRRC jig for the Open-Source IR Replicability Challenge (OSIRRC 2019) at SIGIR 2019. This image is available on Docker Hub and has been tested with the jig at commit ca31987 (6/5/2019).
- Supported test collections: core17
- Required training collections: robust04, robust05
- Supported hooks: init, index, search
Use the commands below to get the runs for WCRobust04 and WCRobust0405 as they were replicated in the course of our participation in CENTRE@CLEF19.
The following jig command can be used to index the New York Times corpus and prepare training data for WCRobust04:
python run.py prepare \
--repo osirrc2019/irc-centre2019 \
--collections robust04=/path/to/robust04/=trectext \
core17=/path/to/core17/=trectext \
--opts run="wcrobust04"
The argument run can be set to run="wcrobust0405" in order to prepare training data for WCRobust0405. In this case, the robust05 corpus has to be mounted as an additional volume:
python run.py prepare \
--repo osirrc2019/irc-centre2019 \
--collections robust04=/path/to/robust04/=trectext \
robust05=/path/to/robust05/=trectext \
core17=/path/to/core17/=trectext \
--opts run="wcrobust0405"
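The two prepare commands differ only in the mounted collections and the value of the run option. As a minimal sketch of that relationship, the hypothetical helper below (build_prepare_command is not part of the jig or of this repository) assembles the argument list for either run from a mapping of collection names to local paths, assuming all collections are in trectext format as in the commands above:

```python
def build_prepare_command(run, paths, repo="osirrc2019/irc-centre2019"):
    """Return the argument list of the jig prepare command for the given run.

    `paths` maps collection names to local directories (hypothetical helper).
    """
    # Collections that have to be mounted for each run, as described above.
    required = {
        "wcrobust04": ["robust04", "core17"],
        "wcrobust0405": ["robust04", "robust05", "core17"],
    }
    if run not in required:
        raise ValueError(f"unknown run: {run!r}")

    missing = [name for name in required[run] if name not in paths]
    if missing:
        raise ValueError(f"missing collection paths: {', '.join(missing)}")

    # Each collection is passed as name=path=format.
    collections = [f"{name}={paths[name]}=trectext" for name in required[run]]
    return [
        "python", "run.py", "prepare",
        "--repo", repo,
        "--collections", *collections,
        "--opts", f"run={run}",
    ]


if __name__ == "__main__":
    command = build_prepare_command(
        "wcrobust0405",
        {
            "robust04": "/path/to/robust04/",
            "robust05": "/path/to/robust05/",
            "core17": "/path/to/core17/",
        },
    )
    print(" ".join(command))
```

The returned list can be passed to subprocess.run from within a checkout of the jig; requesting wcrobust0405 without a robust05 path raises an error before anything is started.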
The following jig command can be used to perform a retrieval run on the New York Times corpus, depending on the previously defined training corpora:
python run.py search \
--repo osirrc2019/irc-centre2019 \
--collection core17 \
--topic topics/topics.core17.txt \
--output /path/to/output/ \
--qrels qrels/qrels.core17.txt
TODO: add outcomes
The following is a short summary of what happens in each of the scripts in this repo.
The Dockerfile installs python3, copies the scripts for the corresponding hooks, and creates the required directory. The working directory is set to /work/.
The init script downloads the code from a repository and installs the required Python packages from the requirements.txt file. Depending on the specified run, scripts for WCRobust04 or WCRobust0405 will be prepared.
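The run selection in the init step might look roughly like the sketch below. It is not the actual init script: it assumes the hook receives its options as a JSON string via a --json argument with an "opts" object holding the run value, and the names KNOWN_RUNS and select_run are hypothetical.

```python
import argparse
import json

# Training collections needed for each supported run (taken from this README).
KNOWN_RUNS = {
    "wcrobust04": ["robust04"],
    "wcrobust0405": ["robust04", "robust05"],
}


def select_run(argv):
    """Parse the hook arguments and return (run, training_collections)."""
    parser = argparse.ArgumentParser()
    # Assumption: options arrive as a JSON string such as
    # '{"opts": {"run": "wcrobust04"}}'.
    parser.add_argument("--json", required=True)
    args = parser.parse_args(argv)

    opts = json.loads(args.json).get("opts", {})
    run = opts.get("run")
    if run not in KNOWN_RUNS:
        raise ValueError(f"unsupported or missing run option: {run!r}")
    return run, KNOWN_RUNS[run]


if __name__ == "__main__":
    run, training = select_run(["--json", '{"opts": {"run": "wcrobust0405"}}'])
    print(run, training)
```

Rejecting an unknown or missing run early keeps the later index and search steps from starting with an undefined configuration.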
The index script runs a subprocess which starts indexing.
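Launching the indexing step as a subprocess can be sketched with the Python standard library as follows. The function name run_indexing and the command passed to it are placeholders, not the names used in this repository:

```python
import subprocess
import sys


def run_indexing(command):
    """Run the indexing command as a subprocess and return its exit code.

    With check=True, a non-zero exit status raises CalledProcessError,
    so a failed indexing run stops the hook instead of passing silently.
    """
    completed = subprocess.run(command, check=True)
    return completed.returncode


if __name__ == "__main__":
    # Placeholder command; the real hook would start the indexing script here.
    run_indexing([sys.executable, "-c", "print('indexing started')"])
```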
The search script starts the ranking, depending on the previously specified run.