io

deepdataspace.io

This module defines the common dataset IO interfaces.

importer

deepdataspace.io.importer

The common interface for importing a dataset.

class ImportHelper

Bases: object

A mixin class that adds helper functions to import a dataset.

static format_image_data(uri: str, thumb_uri: Optional[str] = None, width: Optional[int] = None, height: Optional[int] = None, id_: Optional[int] = None, metadata: Optional[dict] = None, flag: int = 0, flag_ts: int = 0)

A helper function to format image data.

static format_annotation(category: str, label: str = LabelName.GroundTruth, label_type: str = 'GT', conf: float = 1.0, is_group: bool = False, bbox: Optional[Tuple[int, int, int, int]] = None, segmentation: Optional[List[List[int]]] = None, alpha_uri: Optional[str] = None, keypoints: Optional[List[Union[float, int]]] = None, keypoint_colors: Optional[List[int]] = None, keypoint_skeleton: Optional[List[int]] = None, keypoint_names: Optional[List[str]] = None, caption: Optional[str] = None, confirm_type: int = 0)

A helper function to format annotation data.

class Importer(name: str, id_: Optional[str] = None)

Bases: ImportHelper, ABC

The importer interface. Any subclass of Importer should implement the following methods:

  • __init__: perform the initialization work.

  • __iter__: yield a tuple of (image, annotation list) on each iteration.

The following methods are optional:

  • pre_run: a hook that runs before the importing process.

  • post_run: a hook that runs after the importing process.
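The contract above can be illustrated with a self-contained stand-in. `SketchImporter` and `ListImporter` below are hypothetical classes written for this example; they are not deepdataspace's Importer, and the hook order inside `run` as well as the dict layouts are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple


class SketchImporter(ABC):
    """Hypothetical stand-in for the Importer interface (not the real class)."""

    def __init__(self, name: str):
        self.name = name
        self.imported: List[Tuple[Dict, List[Dict]]] = []

    @abstractmethod
    def __iter__(self) -> Iterator[Tuple[Dict, List[Dict]]]:
        """Yield (image, annotation list) once per image."""

    def pre_run(self) -> None:
        """Optional hook called before the import loop."""

    def post_run(self) -> None:
        """Optional hook called after the import loop."""

    def run(self) -> None:
        # Assumed order for this sketch: pre_run, iterate, post_run.
        self.pre_run()
        for image, annotations in self:
            self.imported.append((image, annotations))
        self.post_run()


class ListImporter(SketchImporter):
    """Imports from an in-memory list; the record layout is made up."""

    def __init__(self, name, records):
        super().__init__(name)
        self.records = records
        self.events = []

    def pre_run(self):
        self.events.append("pre_run")

    def post_run(self):
        self.events.append("post_run")

    def __iter__(self):
        for uri, categories in self.records:
            image = {"uri": uri}
            annotations = [{"category": c} for c in categories]
            yield image, annotations


importer = ListImporter("demo", [("a.jpg", ["cat"]), ("b.jpg", ["dog", "cat"])])
importer.run()
```

In a real subclass, the image and annotation dicts would presumably be built with the inherited format_image_data and format_annotation helpers rather than by hand.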

pre_run()

A pre-run hook for subclass importers to prepare data.

post_run()

A post-run hook for subclass importers to clean up data.

on_error(err: Exception)

A hook to handle errors.

load_existing_user_data()

Load existing user-added data from MongoDB so that it is not lost when the dataset is re-imported.

add_user_data(image)

Save manually added user data back.

run_import()

The main process of importing the dataset. It iterates over the dataset and imports every image together with its annotations.

run()

The starting point of the importing process.

class FileImporter(path: str, name: Optional[str] = None, id_: Optional[str] = None, enforce: bool = False)

Bases: Importer, ABC

The importer interface for file-based datasets. In addition to the abstract methods defined in the base Importer class, any subclass of FileImporter should implement the following method:

  • can_import: a static method that checks whether the given path can be imported by this importer.

The following method is optional:

  • collect_files: collect the files related to this dataset, as {file_tag: file_path}.

    By default, this function returns {LabelName.GroundTruth: dataset_file_path}. If there are other related files, such as prediction files, they should be collected here too.
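A minimal sketch of these two methods follows. `SketchFileImporter` is a hypothetical stand-in, not the real FileImporter; the `.json` acceptance rule, the `"GroundTruth"` placeholder used in place of LabelName.GroundTruth, the `"Prediction"` tag, and the `.pred.json` naming convention are all assumptions made for illustration.

```python
import os
import tempfile
from typing import Dict


class SketchFileImporter:
    """Hypothetical stand-in for a FileImporter subclass (not the real class)."""

    GROUND_TRUTH = "GroundTruth"  # placeholder for LabelName.GroundTruth

    def __init__(self, path: str):
        self.path = path

    @staticmethod
    def can_import(path: str) -> bool:
        # Assumed rule for this sketch: accept only existing .json files.
        return os.path.isfile(path) and path.endswith(".json")

    def collect_files(self) -> Dict[str, str]:
        # Start from the default mapping: ground truth -> dataset file.
        files = {self.GROUND_TRUTH: self.path}
        # Assumed convention: a sibling "<name>.pred.json" holds predictions.
        pred_path = self.path[: -len(".json")] + ".pred.json"
        if os.path.isfile(pred_path):
            files["Prediction"] = pred_path
        return files


with tempfile.TemporaryDirectory() as tmp:
    gt_path = os.path.join(tmp, "demo.json")
    pred_path = os.path.join(tmp, "demo.pred.json")
    for p in (gt_path, pred_path):
        with open(p, "w") as f:
            f.write("{}")
    accepted = SketchFileImporter.can_import(gt_path)
    rejected = SketchFileImporter.can_import(os.path.join(tmp, "missing.json"))
    collected = SketchFileImporter(gt_path).collect_files()
```

Keeping can_import static lets a caller test a path without constructing an importer first, which is what makes class-level selection possible.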

collect_files() -> dict

Collect the files related to this dataset, {file_tag: file_path}.

abstract static can_import(path: str)

Check if the given path can be imported by this importer.

run()

The starting point of the importing process.

classmethod get_subclasses()

Get all subclasses of this class. This is used together with the can_import function to choose a proper importer for a given path.

choose_importer_cls(target_path: str) -> Optional[Type[FileImporter]]

Choose the proper importer class for target_path. The right importer is the importer class whose can_import(target_path) returns True.

Parameters:

target_path – the target path to import, either a dataset or a dataset group.
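The selection rule can be sketched as follows. All names here (`SketchBaseImporter`, `JsonImporter`, `CsvImporter`, `sketch_choose_importer_cls`) are hypothetical, and returning the first matching subclass is an assumption; the library may resolve multiple matches differently.

```python
from typing import List, Optional, Type


class SketchBaseImporter:
    """Hypothetical stand-in base class (not the real FileImporter)."""

    @staticmethod
    def can_import(path: str) -> bool:
        return False

    @classmethod
    def get_subclasses(cls) -> List[Type["SketchBaseImporter"]]:
        # Walk the subclass tree so indirect subclasses are found too.
        found = []
        for sub in cls.__subclasses__():
            found.append(sub)
            found.extend(sub.get_subclasses())
        return found


class JsonImporter(SketchBaseImporter):
    @staticmethod
    def can_import(path: str) -> bool:
        return path.endswith(".json")


class CsvImporter(SketchBaseImporter):
    @staticmethod
    def can_import(path: str) -> bool:
        return path.endswith(".csv")


def sketch_choose_importer_cls(target_path: str) -> Optional[Type[SketchBaseImporter]]:
    # Return the first subclass that accepts the path, or None if none does.
    for importer_cls in SketchBaseImporter.get_subclasses():
        if importer_cls.can_import(target_path):
            return importer_cls
    return None
```

With this shape, supporting a new format only requires defining another subclass with its own can_import; the chooser itself does not change.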

import_dataset(target_path: str, enforce: bool = False) -> DataSet

Choose the right importer for the target path automatically and run the import task.

Parameters:
  • target_path – the target path to import, either a dataset or a dataset group.

  • enforce – force the import task to run even if the dataset has already been imported into MongoDB.