Convex Factorization Machine for Regression


We propose the convex factorization machine (CFM), which is a convex variant of the widely used Factorization Machines (FMs). Specifically, we employ a linear+quadratic model and regularize the linear term with the ell_2-regularizer and the quadratic term with the trace norm regularizer. Then, we formulate the CFM optimization as a semidefinite programming problem and propose an efficient optimization procedure with Hazan's algorithm. A key advantage of CFM over existing FMs is that it can find a globally optimal solution, while FMs may get a poor locally optimal solution since the objective function of FMs is non-convex. In addition, the proposed algorithm is simple yet effective and can be implemented easily. Finally, CFM is a general factorization method and can also be used for other factorization problems including multi-view matrix factorization problems. Through synthetic and movielens datasets, we first show that the proposed CFM achieves results competitive to FMs. Furthermore, in a toxicogenomics prediction task, we show that CFM outperforms a state-of-the-art tensor factorization method.

Main Idea

Let us denote the user-item matrix {bf A} in R^{|U| times |I|}. In factorization machine framework (libFM), they formulate recommendation problems as regression problems, where the input {bf x} is a feature vector that indicates the k-th user and the k'-th item, and output y is the rating of the user-item pair:

{bf x}_{i} = [overbrace{0~cdots 0~underbrace{1}_{ktextnormal{-th user}}~0~cdots 0}^{|U|}~overbrace{0 ~cdots 0~underbrace{1}_{k'textnormal{-th item}}~0 cdots 0}^{|I|}]^top in R^{|U| + |I|},
y_i = [{bf A}]_{k,k'}.

Note that, we can easily add user and article meta information such as user gender and article category by concatenating those information to {bf x}.

The optimization problem of the convex factorization machine is given as

min_{{bf w}, {bf W}}~ sum_{i = 1}^{n} (y_i  - f({bf x}_i; {bf w}, {bf W}))^2 + lambda_1 |{bf w}|_2^2~~~ textnormal{s.t.}~ |{bf W}|_{textnormal{tr}} = eta, {bf W} succeq 0,

where |cdot|_{textnormal{tr}} is the trace norm, and

f({bf x};{bf w}, {bf W}) = w_0 + {bf w}_0^top {bf x} + sum_{ell=1}^d sum_{ell'= ell + 1}^d [{bf W}]_{ell,ell'} x_ell x_{ell'},

is the model.


  • Can solve large-scale recommendation task.

  • Can find a globally optimal solution.

  • Can apply to various types of applications including matrix completion, collective matrix completion, multi-view matrix completion.



I am happy to have any kind of feedbacks. E-mail: textnormal{}


  • Yamada, M., Lian, W., Goyal, A., Chen, J., Khan, A. S., Kaski, S., Mamitsuka, H., & Chang, Y.
    Convex Factorization Machine for Regression.