tick.dataset.fetch_tick_dataset

tick.dataset.fetch_tick_dataset(dataset_path, data_home=None, n_features=None, verbose=True)[source]

Fetch dataset from tick_datasets github repository.

Uses cache if this dataset has already been downloaded.

Parameters:

dataset_path : str

Dataset path on tick_datasets github repository. For example “binary/adult/adult.trn.bz2” for adult train dataset

data_home : str, optional, default=None

Specify a download and cache folder for the datasets. If None and not configured with TICK_DATASETS environement variable all tick datasets are stored in ‘~/tick_datasets’ subfolders.

n_features : int, optional, default=None

The number of features to use. If None, it will be inferred. This argument is useful to load several files that are subsets of a bigger sliced dataset: each subset might not have examples of every feature, hence the inferred shape might vary from one slice to another.

verbose : bool, default=True

If True, download progress bar will be printed

Returns :

——- :

output : np.ndarray or dict or tuple

Dataset. Its format will depend on queried dataset.

Examples using tick.dataset.fetch_tick_dataset