Stochastic gradient descent solver
For the minimization of objectives of the form
\[ \frac 1n \sum_{i=1}^n f_i(w) + g(w), \]
where the functions \(f_i\) have smooth gradients and \(g\) is
prox-capable. Function \(f = \frac 1n \sum_{i=1}^n f_i\) corresponds
to the model.loss method of the model (passed with set_model to the
solver) and \(g\) corresponds to the prox.value method of the
prox (passed with the set_prox method).
One iteration of SGD (one epoch) corresponds to the
following update applied epoch_size times:
\[ w^{t+1} \gets \mathrm{prox}_{\eta_t g} \big( w^t - \eta_t \nabla f_i(w^t) \big), \]
where \(i\) is sampled at random at each iteration (the sampling
strategy depends on rand_type) and \(\eta_t = \eta / (t + 1)\), with
\(\eta > 0\) that can be tuned with step. The random number generator
used to draw the samples \(i\) can be seeded with seed. The iterations
stop when the tolerance tol is reached, or after max_iter epochs
(namely max_iter \(\times\) epoch_size iterations).
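The update described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not tick's implementation: it assumes scalar least-squares losses \(f_i(w) = (a_i w - b_i)^2 / 2\), an L1 penalty \(g(w) = \lambda |w|\) whose prox is soft-thresholding, a counter \(t\) that runs over all iterations, and a stopping rule based on the change of \(w\) over one epoch. The names prox_sgd and soft_threshold are hypothetical.

```python
import random


def soft_threshold(x, thresh):
    """Proximal operator of thresh * |x| (soft-thresholding)."""
    if x > thresh:
        return x - thresh
    if x < -thresh:
        return x + thresh
    return 0.0


def prox_sgd(a, b, lam, step, max_iter=100, tol=1e-10, seed=0):
    """Proximal SGD for f_i(w) = 0.5 * (a[i] * w - b[i]) ** 2 and g(w) = lam * |w|."""
    rng = random.Random(seed)
    n = len(a)
    epoch_size = n  # assumption: one epoch is a full pass over the data
    w, t = 0.0, 0
    for _ in range(max_iter):
        w_prev = w
        for _ in range(epoch_size):
            i = rng.randrange(n)               # uniform sampling of i
            eta_t = step / (t + 1)             # decaying step size eta / (t + 1)
            grad_i = a[i] * (a[i] * w - b[i])  # gradient of f_i at w
            w = soft_threshold(w - eta_t * grad_i, eta_t * lam)
            t += 1
        # assumed stopping criterion: change of w over one epoch, scaled by max(|w|, 1)
        if abs(w - w_prev) <= tol * max(abs(w), 1.0):
            break
    return w
```

With identical samples such as a = [1.0] * 4 and b = [2.0] * 4, and lam = 0.0, step = 1.0, the first update already lands on the minimizer w = 2.0, since every later gradient is zero.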
The obtained solution \(w\) is returned by the solve method, and is
also stored in the solution attribute of the solver.
Parameters
step : float
Step-size parameter, the most important parameter of the solver. A try-and-improve approach should be used.
tol : float, default=1e-10
The tolerance of the solver (iterations stop when the stopping criterion is below it)
max_iter : int, default=100
Maximum number of iterations of the solver, namely the maximum number of epochs (by default one epoch is a full pass over the data, unless epoch_size has been modified from its default)
verbose : bool, default=True
If True, the solver prints history; otherwise nothing is displayed, but history is recorded anyway
seed : int, default=-1
The seed of the random sampling. If it is negative then a random seed (different at each run) will be chosen.
epoch_size : int, default given by model
Epoch size, namely how many iterations make up one epoch. By default, this is automatically tuned using information from the model object passed through set_model.
rand_type : {'unif', 'perm'}, default='unif'
How samples are randomly selected from the data:
if 'unif', samples are uniformly drawn among all possibilities
if 'perm', a random permutation of all possibilities is generated and samples are sequentially taken from it. Once all of them have been taken, a new random permutation is generated
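The two sampling strategies can be illustrated with a small stand-alone sketch. The helper sample_indices below is hypothetical and only mirrors the description above using Python's random module; it is not tick's sampling code.

```python
import random


def sample_indices(n_samples, n_draws, rand_type="unif", seed=0):
    """Return n_draws indices in range(n_samples), drawn according to rand_type."""
    rng = random.Random(seed)
    if rand_type == "unif":
        # each draw is independent and uniform, so repeats are possible
        return [rng.randrange(n_samples) for _ in range(n_draws)]
    if rand_type == "perm":
        indices = []
        while len(indices) < n_draws:
            perm = list(range(n_samples))
            rng.shuffle(perm)  # new permutation once the previous one is used up
            indices.extend(perm)
        return indices[:n_draws]
    raise ValueError("rand_type must be 'unif' or 'perm'")
```

With 'perm', every block of n_samples consecutive draws visits each sample exactly once, whereas 'unif' may pick the same sample several times within one pass.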
print_every : int, default=10
Print history information every time the iteration number is a multiple of print_every. Used only if verbose is True
record_every : int, default=1
Save history information every time the iteration number is a multiple of record_every
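How print_every, record_every and verbose interact can be sketched as follows. This is an illustration only: the function run_with_history is hypothetical, and it assumes iteration numbers start at 0, which the text above does not specify.

```python
def run_with_history(max_iter, record_every=1, print_every=10, verbose=True):
    """Loop over epochs, recording and printing on the schedule described above."""
    recorded, printed = [], []
    for n_iter in range(max_iter):  # assumption: iteration numbers start at 0
        # ... one epoch of the solver would run here ...
        if n_iter % record_every == 0:
            recorded.append(n_iter)  # history is saved even when verbose is False
        if verbose and n_iter % print_every == 0:
            printed.append(n_iter)
            print(f"n_iter={n_iter}")
    return recorded, printed
```

For instance, with max_iter=25, record_every=5 and print_every=10, history is recorded at iterations 0, 5, 10, 15 and 20, and printed at 0, 10 and 20 when verbose is True.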
Attributes
model : Model
The model used by the solver, passed with the set_model method
prox : Prox
Proximal operator used by the solver, passed with the set_prox method
solution : numpy.array, shape=(n_coeffs,)
Minimizer found by the solver
history : dict-like
A dict-like object that contains the history of the solver along iterations. It should be accessed using the get_history method
time_start : str
Start date of the call to solve()
time_elapsed : float
Duration of the call to solve(), in seconds
time_end : str
End date of the call to solve()
dtype : {'float64', 'float32'}, default='float64'
Type of the arrays used. This value is set from model and prox dtypes.
tick.solver.SGD