Localized Lasso


The localized Lasso is suited for learning models that are both interpretable and highly predictive in problems with high dimensionality d and small sample size n. More specifically, we consider a function defined by local sparse models, one at each data point. We introduce sample-wise network regularization to borrow strength across the models, and sample-wise exclusive group sparsity (a.k.a. the \ell_{1,2} norm) to introduce diversity into the choice of feature sets in the local models. The local models are interpretable in terms of the similarity of their sparsity patterns. The cost function is convex, and thus has a globally optimal solution. Moreover, we propose a simple yet efficient iterative least-squares based optimization procedure for the localized Lasso, which needs no tuning parameter and is guaranteed to converge to a globally optimal solution.

Main Idea

The localized Lasso is formulated as follows:

\min_{\mathbf{W}} \hspace{0.1cm} \sum_{i=1}^{n} (y_i - \mathbf{w}_i^\top \mathbf{x}_i)^2 + \lambda_1 \sum_{i,j=1}^{n} r_{ij} \|\mathbf{w}_i - \mathbf{w}_j\|_{2} + \lambda_2 \sum_{i=1}^{n} \|\mathbf{w}_i\|_1^2,

where r_{ij} \geq 0, r_{ij} = r_{ji}, and r_{ii} = 0 encode the pre-defined graph information.
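As a concrete reading of the objective, the following NumPy sketch evaluates its three terms for a given n x d weight matrix W (row i is the local model w_i). The function name and signature are illustrative, not part of the released code.

```python
import numpy as np

def localized_lasso_objective(W, X, y, R, lam1, lam2):
    """Evaluate the localized Lasso objective (illustrative sketch).

    W : (n, d) local models, one row per sample
    X : (n, d) inputs, y : (n,) targets
    R : (n, n) graph weights with r_ij >= 0, r_ij = r_ji, r_ii = 0
    """
    # Squared loss with one local model per sample: (y_i - w_i^T x_i)^2
    residuals = y - np.einsum("ij,ij->i", W, X)
    loss = np.sum(residuals ** 2)
    # Network regularizer: sum_{i,j} r_ij ||w_i - w_j||_2
    diffs = W[:, None, :] - W[None, :, :]            # (n, n, d)
    network = np.sum(R * np.linalg.norm(diffs, axis=2))
    # Exclusive group sparsity: sum_i ||w_i||_1^2
    exclusive = np.sum(np.abs(W).sum(axis=1) ** 2)
    return loss + lam1 * network + lam2 * exclusive
```

With W = 0 the two regularizers vanish and the objective reduces to the sum of squared targets, which is a quick sanity check.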


  • Can select nonlinearly related features.

  • Highly scalable w.r.t. the number of features.

  • Convex optimization.
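The iterative least-squares procedure mentioned above can be sketched as a standard majorize-minimize scheme: both non-smooth terms are upper-bounded by quadratics around the current iterate, and the resulting linear system is solved at each step. This is a small-scale, eps-smoothed illustration under our own assumptions (dense solves, simple initialization), not the released implementation.

```python
import numpy as np

def localized_lasso_irls(X, y, R, lam1, lam2, n_iter=50, eps=1e-4):
    """Iterative least-squares sketch for the localized Lasso (MM scheme).

    Majorizes r_ij ||w_i - w_j||_2 by r_ij ||w_i - w_j||_2^2 / (2 d_ij)
    and ||w_i||_1^2 by (sum_k |w_ik|) * sum_k w_ik^2 / |w_ik|, both tight
    at the current iterate, then solves the quadratic surrogate exactly.
    """
    n, d = X.shape
    W = np.ones((n, d))  # nonzero init keeps the IRLS weights finite
    for _ in range(n_iter):
        # Reweighted graph: c_ij = r_ij / ||w_i - w_j||_2 (eps-smoothed)
        D = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2) + eps
        C = R / D
        L = np.diag(C.sum(axis=1)) - C          # graph Laplacian
        # Reweighting for the exclusive term: ||w_i||_1 / |w_ik|
        G = np.abs(W).sum(axis=1, keepdims=True) / (np.abs(W) + eps)
        # Assemble and solve the (n d) x (n d) normal equations
        A = np.zeros((n * d, n * d))
        b = np.zeros(n * d)
        for i in range(n):
            sl = slice(i * d, (i + 1) * d)
            A[sl, sl] = np.outer(X[i], X[i]) + lam2 * np.diag(G[i])
            b[sl] = y[i] * X[i]
        A += lam1 * np.kron(L, np.eye(d))
        W = np.linalg.solve(A + eps * np.eye(n * d), b).reshape(n, d)
    return W
```

Because each iteration minimizes a tight quadratic upper bound, the objective is (up to the eps smoothing) non-increasing across iterations.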



Feedback of any kind is welcome. E-mail: makoto.m.yamada@ieee.org