tick.robust.ModelHuber

tick.robust.ModelHuber(fit_intercept: bool = True, threshold: float = 1, n_threads: int = 1)[source]

Huber loss for robust regression. This model is particularly relevant
for datasets that contain outliers. The class gives first-order
information (gradient and loss) for this model and can be passed to any
solver through the solver's set_model method.
Given training data \((x_i, y_i) \in \mathbb R^d \times \mathbb R\) for
\(i=1, \ldots, n\), this model considers the goodness-of-fit

\[ f(w, b) = \frac 1n \sum_{i=1}^n \ell(y_i, x_i^\top w + b), \]

where \(w \in \mathbb R^d\) is a vector containing the model weights,
\(b \in \mathbb R\) is the intercept (used only whenever fit_intercept=True)
and \(\ell : \mathbb R^2 \rightarrow \mathbb R\) is the loss given by

\[ \ell(y, y') =
\begin{cases}
\tfrac 12 (y' - y)^2 & \text{if } |y' - y| \leq \delta, \\
\delta \big( |y' - y| - \tfrac \delta 2 \big) & \text{otherwise,}
\end{cases} \]

for \(y, y' \in \mathbb R\), where \(\delta > 0\) can be tuned using the
threshold argument. Data is passed to this model through the fit(X, y)
method, where X is the features matrix (dense or sparse) and y is the
vector of labels.
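
Example (a minimal usage sketch; the synthetic data and hyper-parameter
values below are illustrative only):

    import numpy as np
    from tick.robust import ModelHuber

    # Synthetic regression data with a handful of corrupted labels
    np.random.seed(0)
    X = np.random.randn(200, 5)
    w_true = np.random.randn(5)
    y = X.dot(w_true) + 0.1 * np.random.randn(200)
    y[:10] += 10.  # outliers

    # fit() returns the model itself, so construction and fit can be chained
    model = ModelHuber(fit_intercept=True, threshold=1.).fit(X, y)

    # n_coeffs = n_features + 1 here, because fit_intercept=True
    coeffs0 = np.zeros(model.n_coeffs)
    print(model.loss(coeffs0))
    print(model.grad(coeffs0).shape)

The model only exposes first-order information; to estimate the weights it
is meant to be given to a solver through the solver's set_model method.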
Parameters

fit_intercept : bool, default=True
If True, the model uses an intercept.

threshold : float, default=1.
Positive threshold of the loss, see above for details.
Attributes

features : {numpy.ndarray, scipy.sparse.csr_matrix}, shape=(n_samples, n_features)
The features matrix, either dense or sparse.

labels : numpy.ndarray, shape=(n_samples,) (read-only)
The labels vector.

n_samples : int (read-only)
Number of samples.

n_features : int (read-only)
Number of features.

n_coeffs : int (read-only)
Total number of coefficients of the model.

n_threads : int, default=1 (read-only)
Number of threads used for parallel computation:
if int <= 0, the number of threads available on the CPU;
otherwise, the desired number of threads.
__init__(fit_intercept: bool = True, threshold: float = 1, n_threads: int = 1)[source]

Initialize self. See help(type(self)) for accurate signature.
fit(features, labels)[source]

Set the data into the model object.

Parameters

features : {numpy.ndarray, scipy.sparse.csr_matrix}, shape=(n_samples, n_features)
The features matrix, either dense or sparse.

labels : numpy.ndarray, shape=(n_samples,)
The labels vector.

Returns

output : ModelHuber
The current instance with given data.
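
Example (a small sketch; scipy.sparse.random is used only to build an
illustrative CSR matrix):

    import numpy as np
    import scipy.sparse as sps
    from tick.robust import ModelHuber

    X_sparse = sps.random(50, 10, density=0.2, format="csr")
    y = np.random.randn(50)

    # Sparse CSR features are accepted, and fit() returns the model itself
    model = ModelHuber(threshold=2.).fit(X_sparse, y)
    print(model.n_samples, model.n_features)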
get_lip_best() → float

Returns the best Lipschitz constant, using all samples.
Warning: this might take some time, since it requires an SVD computation.

Returns

output : float
The best Lipschitz constant.
get_lip_max() → float

Returns the maximum Lipschitz constant of individual losses. This is
particularly useful for step-size tuning of some solvers.

Returns

output : float
The maximum Lipschitz constant.
get_lip_mean() → float

Returns the average Lipschitz constant of individual losses. This is
particularly useful for step-size tuning of some solvers.

Returns

output : float
The average Lipschitz constant.
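
Example (a sketch of step-size tuning; the 1/L rule below is a general
optimization heuristic, not a prescription of this documentation):

    import numpy as np
    from tick.robust import ModelHuber

    X, y = np.random.randn(100, 5), np.random.randn(100)
    model = ModelHuber().fit(X, y)

    lip_max = model.get_lip_max()    # max over the individual losses
    lip_mean = model.get_lip_mean()  # average over the individual losses
    lip_best = model.get_lip_best()  # global constant, computed with an SVD

    # A classical conservative step size for gradient-based solvers
    step = 1. / lip_max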
grad(coeffs: numpy.ndarray, out: numpy.ndarray = None) → numpy.ndarray

Computes the gradient of the model at coeffs.

Parameters

coeffs : numpy.ndarray
Vector where the gradient is computed.

out : numpy.ndarray or None
If None, a new vector containing the gradient is returned; otherwise, the
result is saved in out and returned.

Returns

output : numpy.ndarray
The gradient of the model at coeffs.

Notes

The fit method must be called to give data to the model before using grad.
An error is raised otherwise.
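
Example (a sketch illustrating the out buffer, on synthetic data):

    import numpy as np
    from tick.robust import ModelHuber

    X, y = np.random.randn(100, 5), np.random.randn(100)
    model = ModelHuber(fit_intercept=False).fit(X, y)

    coeffs = np.zeros(model.n_coeffs)
    out = np.empty(model.n_coeffs)

    # Writing into a preallocated buffer avoids allocating a new array at
    # every call, which matters inside an optimization loop
    model.grad(coeffs, out=out)
    print(out[:3])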
loss(coeffs: numpy.ndarray) → float

Computes the value of the goodness-of-fit at coeffs.

Parameters

coeffs : numpy.ndarray
The loss is computed at this point.

Returns

output : float
The value of the loss.

Notes

The fit method must be called to give data to the model before using loss.
An error is raised otherwise.
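
Example (a sketch on noiseless synthetic data, so that the loss at the true
weights is exactly zero):

    import numpy as np
    from tick.robust import ModelHuber

    np.random.seed(1)
    X = np.random.randn(100, 3)
    w_true = np.array([1., -2., 0.5])
    y = X.dot(w_true)

    model = ModelHuber(fit_intercept=False).fit(X, y)
    print(model.loss(w_true))                    # 0.0 at the true weights
    print(model.loss(np.zeros(model.n_coeffs)))  # strictly positive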
loss_and_grad(coeffs: numpy.ndarray, out: numpy.ndarray = None) → tuple

Computes the value and the gradient of the function at coeffs.

Parameters

coeffs : numpy.ndarray
Vector where the loss and gradient are computed.

out : numpy.ndarray or None
If None, a new vector containing the gradient is returned; otherwise, the
result is saved in out and returned.

Returns

loss : float
The value of the loss.

grad : numpy.ndarray
The gradient of the model at coeffs.

Notes

The fit method must be called to give data to the model before using
loss_and_grad. An error is raised otherwise.
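
Example (a hand-written gradient-descent sketch, not one of the library's
solvers, only an illustration of the joint evaluation):

    import numpy as np
    from tick.robust import ModelHuber

    np.random.seed(2)
    X, y = np.random.randn(200, 4), np.random.randn(200)
    model = ModelHuber(fit_intercept=False).fit(X, y)

    coeffs = np.zeros(model.n_coeffs)
    step = 1. / model.get_lip_best()
    for _ in range(100):
        # One call gives both the objective value and the gradient
        obj, g = model.loss_and_grad(coeffs)
        coeffs -= step * g
    print(obj)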