Self-Concordant Proximal Gradient descent
For the minimization of objectives of the form

\[
\min_w \; f(w) + g(w),
\]

where \(f\) is self-concordant and \(g\) is prox-capable.
Function \(f\) corresponds to the model.loss method of the model
(passed with set_model to the solver) and \(g\) corresponds to
the prox.value method of the prox (passed with the set_prox method).
One iteration of SCPG is as follows:

\[
w^{k+1} \leftarrow \mathrm{prox}_{g / L_k} \Big( w^k - \frac{1}{L_k} \nabla f(w^k) \Big),
\]
where \(\nabla f(w)\) is the gradient of \(f\) given by the
model.grad method and \(\mathrm{prox}_{g / L_k}\) is given by the
prox.call method. The initial step size \(1 / L_k\) can be tuned with
step; it is then updated using Barzilai-Borwein steps and linesearch.
The iterations stop whenever tolerance tol is achieved, or after
max_iter iterations. The obtained solution \(w\) is returned
by the solve method, and is also stored in the solution attribute
of the solver.
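The iteration above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the solver's actual implementation: soft_threshold stands in for a generic prox.call (here the \(\ell_1\) prox), and the Barzilai-Borwein and linesearch logic is simplified to its textbook form.

```python
import numpy as np

def soft_threshold(x, t):
    # Prox of t * ||.||_1; stands in for a generic prox.call
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_grad(f, grad_f, prox, w0, step=1.0, max_iter=100, tol=1e-10,
              incr=2.0, decr=0.5):
    # Proximal gradient with Barzilai-Borwein step estimates and a
    # backtracking linesearch, mirroring the iteration above
    w = w0.copy()
    L = 1.0 / step                    # the text's L_k, i.e. 1 / step
    w_prev = gw_prev = None
    for _ in range(max_iter):
        gw = grad_f(w)
        if w_prev is not None:
            # Barzilai-Borwein estimate: L ~ <dw, dg> / <dw, dw>
            dw, dg = w - w_prev, gw - gw_prev
            if dw @ dw > 0 and dw @ dg > 0:
                L = (dw @ dg) / (dw @ dw)
        L = max(L / incr, 1e-12)      # optimistic step increase
        while True:
            # w_new <- prox_{g / L}(w - (1 / L) * grad f(w))
            w_new = prox(w - gw / L, 1.0 / L)
            d = w_new - w
            # accept once the quadratic upper bound on f holds
            if f(w_new) <= f(w) + gw @ d + 0.5 * L * (d @ d):
                break
            L /= decr                 # decr < 1, so the step shrinks
        w_prev, gw_prev = w, gw
        w = w_new
        if np.linalg.norm(w - w_prev) <= tol:
            break
    return w
```

For a quadratic \(f\) this recovers the closed-form \(\ell_1\) prox solution; the real solver additionally exploits self-concordance of \(f\) when choosing steps.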
step : float, default=None
Step-size parameter used at the first iteration; subsequent steps are computed by the linesearch algorithm
tol : float, default=1e-10
The tolerance of the solver (iterations stop when the stopping criterion is below it)
max_iter : int, default=100
Maximum number of iterations of the solver.
verbose : bool, default=True
If True, the solver prints history information; otherwise nothing is displayed, but history is recorded anyway
print_every : int, default=10
Print history information every time the iteration number is a multiple of
print_every. Used only if verbose is True
record_every : int, default=1
Save history information every time the iteration number is a multiple of
record_every
linesearch_step_increase : float, default=2.
Factor of step increase when using linesearch
linesearch_step_decrease : float, default=0.5
Factor of step decrease when using linesearch
modified : bool, default=False
Enables the modified version of the algorithm, which evaluates \(f\) at both \(y_k\) and \(w^{k+1}\) and keeps the iterate with the smallest value of \(f\)
dtype : {'float64', 'float32'}, default='float64'
Type of the arrays used. This value is set from model and prox dtypes.
model : Model
The model used by the solver, passed with the set_model method
prox : Prox
Proximal operator used by the solver, passed with the set_prox method
solution : numpy.array, shape=(n_coeffs,)
Minimizer found by the solver
history : dict-like
A dict-type object that contains the history of the solver along iterations. It should be accessed using the get_history method
time_start : str
Start date of the call to solve()
time_elapsed : float
Duration of the call to solve(), in seconds
time_end : str
End date of the call to solve()
Notes
This algorithm is designed to work properly with self-concordant losses,
such as Poisson regression with linear link (ModelPoisReg) or the
Hawkes process log-likelihood (ModelHawkesSumExpKernLogLik)
References
Tran-Dinh Q., Kyrillidis A., Cevher V. Composite self-concordant minimization. Journal of Machine Learning Research (2015)