Rank-Lips: Learning-to-Rank on Trec_Eval

High-performance multi-threaded list-wise learning-to-rank implementation that supports mini-batched learning. The implementation focuses on coordinate ascent to optimize for mean-average precision, which is one of the strongest models for model combination in information retrieval.

Rank-lips is designed to work with trec_eval file formats for defining runs (run format) and relevance data (qrel format). The features will be taken from the score and/or reciprocal rank of each input file. The filename of an input run (in the directory) will be used as a feature name. If you want to train a model and predict on a different test set, make sure that the input runs for test features are using exactly the sane filename. We recommend to create different directories for training and test sets.

Rank-lips is implemented in GHC Haskell, but we also provide Linux binaries. Rank-lips is released AS-IS under the BSD-3-Clause open source license.

Authors: Laura Dietz and Ben Gamari.

Installation

The latest release is v1.1.

Quick Usage

Training with 5-fold cross-validation

The features are given as a set of run-files (one run = one feature). The filename of the runfile in the $TRAIN_FEATURE_DIR is used as a feature name. When the trained model is used to predict on a different dataset, the directory of test features need to use exactly the same file names. (Hint: you can use a directory of softlinks to define the names.) The ground truth is given as a qrels format.

rank-lips train -d "$TRAIN_FEATURE_DIR" -q "$QREL" -e "$FRIENDLY_NAME" -O "$OUT_DIR" -o "$OUT_PREFIX" -z-score --feature-variant FeatScore --mini-batch-size 1000 --convergence-threshold -0.001 --folds 5 --restarts 5 --threads 5 --train-cv

The $OUT_DIR$ will contain the following files:

Cross-validation:

Trained on whole data set:

To predict with rank-lips trained with previous versions, please include --is-v10-model.

See Also