- Classification: yi belongs to a finite set of discrete values, and the values are unordered.
- Regression: yi lies in a continuous metric space, and the value of yi carries relevance information about the sample.
- Ordinal regression: yi belongs to a finite set of discrete values as in classification, but the values are ordered as in regression. It can be interpreted as a ranking problem.
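To make the distinction concrete, here is a hypothetical example of the same object (a movie) labeled under each of the three settings:

```python
# Three kinds of target for the same movie (values are made up):
classification_label = "comedy"  # unordered category: "comedy" vs "drama" has no order
regression_label = 7.3           # real-valued score on a continuous scale
ordinal_label = 4                # star rating in {1, ..., 5}: discrete but ordered
```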
Given a sample set S = {(xi, yi)}, the ordinal regression problem aims to find the hypothesis h that minimizes the risk functional R(h). As usual, the expected loss E[L()] is used here as the risk.
Since only the order matters, the loss function L(y1, y2, y1', y2') returns 1 when the predicted order of a pair is incorrect and 0 otherwise. We can then redefine the sample set over pairs of the original variables: S' = {((x1, x2), sign(y1 - y2))}.
Under this formulation the output value lies in the set {-1, 0, +1}, which turns the ordinal regression problem into a classification problem over pairs.
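The pair construction above can be sketched as follows. This is a minimal NumPy illustration of building S' from toy data (the data and the helper name `make_pairs` are my own, not from the paper):

```python
import numpy as np

def make_pairs(X, y):
    # Build the transformed sample set S': for every pair (i, j), i < j,
    # with y_i != y_j, emit the difference vector x_i - x_j together
    # with the label sign(y_i - y_j).
    diffs, labels = [], []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            if y[i] != y[j]:
                diffs.append(X[i] - X[j])
                labels.append(np.sign(y[i] - y[j]))
    return np.array(diffs), np.array(labels)

# Toy ordinal data: 4 samples with ranks in {1, 2, 3}.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]])
y = np.array([1, 1, 2, 3])
D, s = make_pairs(X, y)
print(D.shape)  # (5, 2): one tied pair out of the 6 is dropped
```

Note that tied pairs (y_i = y_j) carry no ordering information, so they are simply skipped.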
On the other hand, one can also express the ordinal hypothesis with a real-valued utility function. For each hypothesis h, we use a function U together with a sequence of thresholds theta such that the predicted rank of x is determined by the interval of the threshold sequence into which U(x) falls.
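In this standard thresholded-utility formulation, the rule can be written as (with sentinel thresholds at the two ends):

```latex
h(x) = r_k
\quad \text{iff} \quad
\theta_{k-1} < U(x) \le \theta_k,
\qquad
\theta_0 = -\infty,\;
\theta_K = +\infty
```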
The function U and the theta thresholds can then be found by maximizing the margin on the training data. The procedure is the same as in the well-known Support Vector Machine, which requires solving a standard QP problem.
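As a rough sketch of this step, one can hand the pairwise difference vectors to an off-the-shelf linear SVM, which solves the regularized QP internally; its weight vector w then plays the role of a linear utility U(x) = w . x. This uses scikit-learn's `LinearSVC` as a stand-in for the paper's solver, on made-up toy data:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy ordinal data (hypothetical): rank grows with the first feature.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]])
y = np.array([1, 1, 2, 3])

# Pairwise difference vectors in both directions, so that both
# classes {-1, +1} are present for the binary SVM.
diffs, labels = [], []
for i in range(len(y)):
    for j in range(len(y)):
        if y[i] != y[j]:
            diffs.append(X[i] - X[j])
            labels.append(np.sign(y[i] - y[j]))

# LinearSVC solves the margin-maximization QP; coef_ is the learned w.
clf = LinearSVC(C=10.0).fit(np.array(diffs), np.array(labels))
w = clf.coef_.ravel()
print(w)
```

The thresholds theta are not recovered here; for pure pairwise ranking only w is needed.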
For any pair of unseen samples, we compute the feature difference and multiply it by the learned weight vector w; the sign then gives the pairwise relationship of the data.
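The prediction step is just a sign test on the projected difference. A minimal sketch, assuming a weight vector w has already been learned (the value below is made up for illustration):

```python
import numpy as np

# Hypothetical learned weight vector from the pairwise SVM step.
w = np.array([1.0, 0.2])

def pairwise_order(x1, x2, w):
    # sign(w . (x1 - x2)): +1 means x1 is ranked above x2,
    # -1 means the opposite, 0 is a tie under the learned utility.
    return int(np.sign(w @ (x1 - x2)))

a = np.array([3.0, 1.0])
b = np.array([0.0, 1.0])
print(pairwise_order(a, b, w))  # → 1
```

To rank a whole list, one can equivalently sort the samples by the score w . x directly, avoiding explicit pairwise comparisons at prediction time.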
Although the algorithm is simple and easy to understand, the pairwise computation is expensive: a training set of n samples yields O(n^2) pairs. Scalability is therefore a major concern for future work.
