tick.preprocessing.LongitudinalSamplesFilter

class tick.preprocessing.LongitudinalSamplesFilter(n_jobs=-1)

Longitudinal data preprocessor that filters out samples whose labels are all null (zero) over the entire observation period.

Parameters:

n_jobs : int, default=-1

Number of tasks to run in parallel. If set to -1, one task is run per available core.
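
The filtering rule described above can be sketched in plain NumPy. This only illustrates the rule (drop every sample whose label array contains no non-zero entry). It is not tick's implementation: the helper name `drop_all_null_samples` and the toy data are hypothetical, and dense arrays stand in for the sparse feature matrices.

```python
import numpy as np


def drop_all_null_samples(features, labels, censoring):
    """Hypothetical helper: keep samples that have at least one non-zero label."""
    # Indices of samples whose label array contains a non-zero entry.
    keep = [i for i, y in enumerate(labels) if np.any(y != 0)]
    kept_features = [features[i] for i in keep]
    kept_labels = [labels[i] for i in keep]
    # Censoring is a 1D array with one entry per sample, so index it directly.
    kept_censoring = np.asarray(censoring)[keep]
    return kept_features, kept_labels, kept_censoring


# Toy data made up for illustration: three samples, two time intervals each.
features = [np.eye(2), np.ones((2, 2)), np.zeros((2, 2))]
labels = [np.array([0, 1]), np.array([0, 0]), np.array([1, 0])]
censoring = np.array([2, 2, 1])

f, l, c = drop_all_null_samples(features, labels, censoring)
```

Here the second sample is the only one with all-zero labels, so it is the only one removed; the features, labels, and censoring entries of the remaining samples keep their relative order.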

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> from tick.preprocessing import LongitudinalSamplesFilter
>>> features = [csr_matrix([[0, 1, 0],
...                         [0, 0, 0],
...                         [0, 0, 1]], dtype="float64"),
...             csr_matrix([[1, 1, 0],
...                         [0, 0, 1],
...                         [0, 0, 0]], dtype="float64")
...             ]
>>> censoring = np.array([3, 2], dtype="uint64")
>>> labels = [np.array([0, 1, 0], dtype="uint64"), np.zeros(3, dtype="uint64")]
>>> lfl = LongitudinalSamplesFilter()
>>> features, labels, censoring = lfl.fit_transform(features, labels, censoring)
>>> # output comes as a list of sparse matrices or 2D numpy arrays
>>> features.__class__
<class 'list'>
>>> [x.toarray() for x in features]
[array([[0., 1., 0.],
        [0., 0., 0.],
        [0., 0., 1.]]), array([[1., 1., 0.],
                               [0., 0., 1.],
                               [0., 0., 0.]])]
>>> labels
[array([0, 1, 0], dtype=uint64), array([0, 0, 0], dtype=uint64)]
>>> censoring
array([3, 2], dtype=uint64)