Learning to rank: from pairwise approach to listwise approach,
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li, April 2007
This paper proposes a listwise approach to learning to rank, which the authors report performs better than the pairwise approach proposed earlier.
The pairwise approach has several weaknesses.
First, the objective of learning is formalized as minimizing errors in classification of document pairs, rather than minimizing errors in ranking of documents.
Second, training is costly because it requires computation over document pairs, and the number of pairs is very large.
Third, the assumption that document pairs are drawn i.i.d. is too strong.
Last, the number of generated document pairs varies widely from query to query, so the trained model is biased toward queries with more document pairs.
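To see how uneven the pair counts can get, here is a small sketch. It assumes a pair is formed from two documents of the same query that have different relevance labels; the function name and the toy labels are made up for illustration and do not come from the paper.

```python
from itertools import combinations


def count_pairs(labels):
    """Count document pairs with different relevance labels for one query.

    Assumption: only documents with different labels form a training pair.
    """
    return sum(1 for a, b in combinations(labels, 2) if a != b)


# Hypothetical relevance labels (2 = highly relevant, 0 = not relevant).
queries = {
    "q1": [2, 1, 0],                        # 3 judged documents
    "q2": [2, 1, 1, 0, 0, 0, 0, 0, 0, 0],   # 10 judged documents
}

pair_counts = {name: count_pairs(labels) for name, labels in queries.items()}
print(pair_counts)
```

With these toy labels the second query contributes several times as many pairs as the first, so a loss summed over pairs is dominated by it.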
Proposed Method:
The method defines a ranking function f(X) that outputs a score Z for each feature vector X. Applying this function to the list of feature vectors for a query gives a list of ranking scores Z0, Z1, ..., Zn. Given a training set, the goal is to find the ranking function that minimizes a listwise loss function L(Y, Z), where Y is the list of judgments for the same documents.
What remains is to choose the loss function and the optimization method.
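The setup above can be sketched as follows. A linear scoring function is assumed here purely for illustration; the names `score_list` and `weights` are hypothetical.

```python
def score_list(weights, feature_vectors):
    """Apply a linear ranking function f(X) = w . X to every document of one query.

    Assumption: f is linear; any function that maps a feature vector
    to a single score would fit the same setup.
    """
    return [
        sum(w * x for w, x in zip(weights, features))
        for features in feature_vectors
    ]


# One query with three documents, each described by two made-up features.
documents = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
scores = score_list([2.0, -1.0], documents)   # the list Z0, Z1, Z2
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

Sorting the documents by score gives the predicted ranking; the listwise loss L(Y, Z) then compares the whole score list with the whole judgment list at once.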
They propose a probabilistic method for computing the listwise loss function. Specifically, they transform both the scores assigned to the documents by the ranking function and the explicit or implicit human judgments of the documents into probability distributions. Any metric between probability distributions can then be used as the loss function. Two models are considered for the transformation: one is called permutation probability and the other top one probability.
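A minimal sketch of the two probability models may help here. It assumes the exponential function is used to turn scores into positive values, and it uses cross entropy as the metric between the two top one distributions, which is the choice ListNet makes. The function names are my own.

```python
import math
from itertools import permutations


def permutation_probability(scores, order):
    """Probability of ranking the documents in `order` (a tuple of indices).

    At each position, the chosen document competes against all documents
    that have not been placed yet.
    """
    prob = 1.0
    for j in range(len(order)):
        numerator = math.exp(scores[order[j]])
        denominator = sum(math.exp(scores[k]) for k in order[j:])
        prob *= numerator / denominator
    return prob


def top_one_probabilities(scores):
    """Probability of each document being ranked first (a softmax)."""
    shift = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(judgments, scores):
    """Cross entropy between the top one distributions of Y and Z."""
    p_y = top_one_probabilities(judgments)
    p_z = top_one_probabilities(scores)
    return -sum(py * math.log(pz) for py, pz in zip(p_y, p_z))


judgments = [2.0, 1.0, 0.0]       # toy human judgments
good_scores = [3.0, 2.0, 1.0]     # same order as the judgments
bad_scores = [1.0, 2.0, 3.0]      # reversed order
print(listwise_loss(judgments, good_scores), listwise_loss(judgments, bad_scores))
```

The permutation model needs n! terms for a list of n documents, whereas the top one model needs only n, which is why the top one probability is the practical choice for the loss.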
The major contributions of this paper include
(1) proposal of the listwise approach,
(2) formulation of the listwise loss function on the basis of probability models,
(3) development of the ListNet method (which uses a neural network as the model and gradient descent as the optimization algorithm),
(4) empirical verification of the effectiveness of the approach.
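To make contribution (3) concrete, here is a rough sketch of ListNet-style training. It assumes a linear model without hidden layers, the top one probability with the exponential function, and cross entropy as the loss. The learning rate, epoch count, and toy data are arbitrary choices of mine, not values from the paper.

```python
import math


def softmax(values):
    """Top one probabilities of a score list."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def train_listnet_linear(queries, n_features, lr=0.1, epochs=200):
    """Gradient descent on the cross entropy between top one distributions.

    `queries` is a list of (feature_vectors, judgments) pairs, one per query.
    For a linear model the gradient with respect to the weights is
    sum_j (p_z[j] - p_y[j]) * x_j.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, judgments in queries:
            scores = [sum(w * x for w, x in zip(weights, fv)) for fv in features]
            p_y = softmax(judgments)
            p_z = softmax(scores)
            for j, fv in enumerate(features):
                diff = p_z[j] - p_y[j]
                for i in range(n_features):
                    weights[i] -= lr * diff * fv[i]
    return weights


# Toy training set: one query, three documents, two made-up features.
toy_queries = [([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]], [2.0, 1.0, 0.0])]
learned = train_listnet_linear(toy_queries, n_features=2)
learned_scores = [sum(w * x for w, x in zip(learned, fv)) for fv in toy_queries[0][0]]
print(learned, learned_scores)
```

On this toy query the learned scores should order the documents the same way as the judgments; a real experiment would use benchmark data and an evaluation measure such as NDCG.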