CollaborativeCoding.dataloaders.download

Classes

Downloader

Checks whether the supported datasets (MNIST, SVHN, USPS) are available locally and downloads them when needed.

Module Contents

class CollaborativeCoding.dataloaders.download.Downloader

Checks whether the supported datasets (MNIST, SVHN, USPS) are available locally and downloads them when needed.

Methods

mnist(data_dir: Path) -> tuple[np.ndarray, np.ndarray]

Check whether the MNIST dataset is available. If it is not, download it into the MNIST folder in data_dir.

svhn(data_dir: Path) -> tuple[np.ndarray, np.ndarray]

Download the SVHN dataset and save it as an HDF5 file to data_dir.

usps(data_dir: Path) -> tuple[np.ndarray, np.ndarray]

Download the USPS dataset and save it as an HDF5 file to data_dir.

Raises

NotImplementedError

If the download method is not implemented for the dataset.

Examples

>>> from pathlib import Path
>>> from CollaborativeCoding import Downloader
>>> data_dir = Path('tmp')  # avoid shadowing the built-in dir()
>>> data_dir.mkdir(exist_ok=True)
>>> train, test = Downloader().usps(data_dir)

mnist(data_dir: pathlib.Path) -> tuple[numpy.ndarray, numpy.ndarray]

Check whether the MNIST dataset is available. If it is not, download it into the MNIST folder in data_dir.
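The check-then-download behaviour described for mnist can be sketched roughly as below. This is a minimal illustration, not the Downloader implementation: the helper name ensure_dataset and the fetch callable are hypothetical, and the real method may decide availability differently (for example by looking for specific files).

```python
from pathlib import Path
from typing import Callable


def ensure_dataset(data_dir: Path, folder: str, fetch: Callable[[Path], None]) -> Path:
    """Return data_dir/folder, calling fetch(target) only if it is missing or empty.

    `fetch` is a hypothetical callable that downloads the dataset into `target`.
    """
    target = data_dir / folder
    # Treat a missing or empty folder as "dataset not available".
    if not target.exists() or not any(target.iterdir()):
        target.mkdir(parents=True, exist_ok=True)
        fetch(target)
    return target
```

Calling the helper a second time with the same arguments skips the download, because the folder already contains files.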

svhn(data_dir: pathlib.Path) -> tuple[numpy.ndarray, numpy.ndarray]

Download the SVHN dataset and save it as an HDF5 file to data_dir.

usps(data_dir: pathlib.Path) -> tuple[numpy.ndarray, numpy.ndarray]

Download the USPS dataset and save it as an HDF5 file to data_dir/usps.h5.

__extract_usps(src: pathlib.Path, dest: pathlib.Path, mode: str)

static __reporthook(blocknum, blocksize, totalsize)

Report download progress. Intended to be passed as the reporthook argument of urllib.request.urlretrieve.
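For orientation, a progress hook of this kind might look like the sketch below. urllib.request.urlretrieve calls its reporthook with three arguments: the number of blocks transferred so far, the block size in bytes, and the total size of the file. The function names and the output format here are assumptions for illustration and are not taken from the Downloader source.

```python
import sys


def format_progress(blocknum: int, blocksize: int, totalsize: int) -> str:
    """Build a progress string from the three values urlretrieve reports."""
    downloaded = blocknum * blocksize
    if totalsize > 0:
        # The final block can overshoot the real size, so clamp to 100%.
        percent = min(100.0, downloaded * 100.0 / totalsize)
        return f"{percent:5.1f}% ({min(downloaded, totalsize)} / {totalsize} bytes)"
    # The total size may be reported as -1 when the server does not provide it.
    return f"{downloaded} bytes"


def reporthook(blocknum: int, blocksize: int, totalsize: int) -> None:
    """Hook with the signature urlretrieve expects; rewrites one status line."""
    sys.stderr.write("\r" + format_progress(blocknum, blocksize, totalsize))
    if totalsize > 0 and blocknum * blocksize >= totalsize:
        sys.stderr.write("\n")
```

Such a hook would be passed as urllib.request.urlretrieve(url, filename, reporthook=reporthook). Keeping the formatting in a separate pure function makes it possible to test it without a network connection.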

static __check_integrity(filepath, checksum)

Check the integrity of the USPS dataset file.

Args

filepath : pathlib.Path

Path to the USPS dataset file.

checksum : str

MD5 checksum of the dataset file.

Returns

bool

True if the MD5 checksum of the file matches the expected checksum, False otherwise.
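An MD5 integrity check matching this description could be written along the following lines. This is a sketch under assumptions: the function name check_integrity, the chunk_size parameter, and the case-insensitive comparison are illustrative choices, not confirmed details of the private __check_integrity method.

```python
import hashlib
from pathlib import Path


def check_integrity(filepath: Path, checksum: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the MD5 hex digest of `filepath` equals `checksum`."""
    md5 = hashlib.md5()
    with open(filepath, "rb") as fh:
        # Read in chunks so large dataset files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    # Hex digests are lowercase; normalise the expected value before comparing.
    return md5.hexdigest() == checksum.lower()
```

MD5 is adequate for detecting corrupted or truncated downloads, but it does not protect against deliberate tampering.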