Ph.D. Dissertation Defense "Learning to Rank Algorithms and their Application in Machine Translation" by Tian Xia

Tuesday, November 24, 2015, 9 am to 11 am
Campus: 
Dayton
399 Joshi
Audience: 
Current Students
Faculty

ABSTRACT:

In this thesis, we discuss two issues in the learning to rank area, choosing an effective objective loss function and constructing effective regression trees in the gradient boosting framework, as well as a third issue, applying learning to rank models to statistical machine translation.

First, list-wise based learning to rank methods either directly optimize performance measures or optimize surrogate functions of performance measures with smaller gaps between the optimized losses and the performance measures, so it is generally believed that they should lead to better performance than point- and pair-wise based learning to rank methods. However, in real-world applications, state-of-the-art practical learning to rank systems, such as MART and LambdaMART, do not come from the list-wise camp. One cause may be that several list-wise based methods work well on the popular but very small LETOR datasets but fail on the real-world datasets that are often used for training practical systems.

We propose a list-wise learning to rank method that is based on a list-wise surrogate function, the Plackett-Luce (PL) model. The PL model has a convex loss, which guarantees a global optimum, and it is proven to be consistent with certain performance measures such as the NDCG score. When we conduct experiments on the PL model, we observe that its performance is actually unstable: when the data has rich enough features, it gives very good results, but on data with scarce features it fails badly. For example, when we apply the PL model with a linear model on the Microsoft 30K dataset, its NDCG@1 score is 7.6 points worse than the average of several linear systems. This motivates us to propose a new ranking system, PLRank, that is suitable for any dataset through a mapping from feature space into tree space to gain more expressive power. PLRank is trained within the gradient boosting framework and is simple to implement. It has the same time complexity as LambdaMART and runs slightly faster in practice. Moreover, we extend three other list-wise surrogate functions into the gradient boosting framework for a fair and full comparison, and we find that the PL model has special advantages.
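For illustration, the following minimal Python sketch computes the Plackett-Luce negative log-likelihood for a single ranked list; the function name and toy example are illustrative assumptions, not taken from the thesis, and the sketch assumes the items are already sorted by the ground-truth ranking (best first).

import numpy as np

def plackett_luce_loss(scores):
    """Negative log-likelihood of the ground-truth permutation under the PL model.

    P(permutation) = prod_j exp(s_j) / sum_{k >= j} exp(s_k),
    so the loss is sum_j [ logsumexp(s_j, ..., s_n) - s_j ].
    """
    scores = np.asarray(scores, dtype=float)
    loss = 0.0
    for j in range(len(scores)):
        tail = scores[j:]
        # log-sum-exp over the remaining items, stabilized against overflow
        m = tail.max()
        loss += m + np.log(np.exp(tail - m).sum()) - scores[j]
    return loss

# Example: four documents whose true order matches the given order.
print(plackett_luce_loss([2.0, 1.0, 0.5, -1.0]))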

Second, industry-level applications of learning to rank models have been dominated by the gradient boosting framework, which fits a tree using the least square error (SE) principle. Another tree-fitting principle, (robust) weighted least square error ((R)WSE), has been widely used in classification, for example in LogitBoost and its variants, but has not been reformulated for learning to rank tasks. For both principles, there is a lack of deep analysis of their relationship in the learning to rank setting. Motivated by AdaBoost, we propose a new principle named least objective loss based error (OLE) that enables us to analyze several important learning to rank systems: we prove that (R)WSE is actually a special case of OLE for derivative additive loss functions, and that OLE, (R)WSE and SE are equivalent for the MART system. Under the guidance of the OLE principle, we implement three typical and strong systems and conduct experiments on two real-world datasets. Experimental results show that our proposed OLE principle improves over SE on most results.
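As background, the following minimal Python sketch shows the baseline SE tree-fitting step that MART-style gradient boosting performs each round; it does not reproduce the proposed OLE principle, and the function name, tree depth, and learning rate are illustrative assumptions.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def se_boosting_round(X, negative_gradients, max_depth=4, learning_rate=0.1):
    """Fit one regression tree to the negative gradients by least squares (SE)."""
    # DecisionTreeRegressor's default split criterion is squared error.
    tree = DecisionTreeRegressor(max_depth=max_depth)
    tree.fit(X, negative_gradients)
    # The tree's predictions are added to the current model scores,
    # scaled by the learning rate (shrinkage).
    return lambda X_new: learning_rate * tree.predict(X_new)

# Toy example: 100 documents with 10 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
negative_gradients = rng.normal(size=100)  # stand-in for per-document negative gradients
update = se_boosting_round(X, negative_gradients)
print(update(X[:5]))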

Third, margin infused relaxed algorithms (MIRAs) dominate model tuning in statistical machine translation when large-scale features are used, but they are also notorious for their implementation complexity. We introduce a new method, which regards an N-best list as a permutation and minimizes the Plackett-Luce loss of ground-truth permutations. Experiments with large-scale features demonstrate that the new method is more robust than MERT; while it only matches MIRAs in performance, it has a comparative advantage: it is easier to implement.
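To make the idea concrete, the following minimal Python sketch computes the gradient of the Plackett-Luce loss for an N-best list scored by a linear model, assuming the hypotheses are ordered by ground-truth quality (for example, sentence-level BLEU); the plain gradient step and all names are illustrative assumptions rather than the exact tuning procedure of the thesis.

import numpy as np

def pl_gradient(weights, features_sorted):
    """Gradient of the PL negative log-likelihood w.r.t. a linear model's weights.

    features_sorted is an (n, d) array of hypothesis features, rows ordered by
    the ground-truth ranking (best hypothesis first).
    """
    scores = features_sorted @ weights
    grad = np.zeros_like(weights)
    for j in range(len(scores)):
        tail_scores = scores[j:]
        tail_feats = features_sorted[j:]
        p = np.exp(tail_scores - tail_scores.max())
        p /= p.sum()
        # d/dw [ logsumexp(s_j, ..., s_n) - s_j ]
        grad += p @ tail_feats - features_sorted[j]
    return grad

# One gradient step on a toy N-best list: 5 hypotheses, 3 features.
rng = np.random.default_rng(1)
feats = rng.normal(size=(5, 3))
w = np.zeros(3)
w -= 0.1 * pl_gradient(w, feats)
print(w)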

Ph.D. Committee: Drs. Shaojun Wang (Advisor), Keke Chen, Michael Raymer, and Xinhui Zhang, BIE

For information, contact