myria3d.pctl

Objects related to the preprocessing and loading of Lidar data.

myria3d.pctl.datamodule.hdf5

class myria3d.pctl.datamodule.hdf5.HDF5LidarDataModule(data_dir: str, split_csv_path: str, hdf5_file_path: str, epsg: str, points_pre_transform: typing.Optional[typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data]] = None, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, subtile_overlap_predict: numbers.Number = 0, batch_size: int = 12, num_workers: int = 1, prefetch_factor: int = 2, transforms: typing.Optional[typing.Dict[str, typing.List[typing.Callable]]] = None, **kwargs)[source]

Datamodule to feed train and validation data to the model.

property dataset: myria3d.pctl.dataset.hdf5.HDF5Dataset

Abstraction to ease HDF5 dataset instantiation.

Parameters

las_paths_by_split_dict (LAS_PATHS_BY_SPLIT_DICT_TYPE, optional) – Maps split (val/train/test) to file path. If specified, the hdf5 file is created at dataset initialization time. Otherwise, a precomputed HDF5 file is used directly without I/O to the HDF5 file. This is useful for multi-GPU training, where data creation is performed in the prepare_data method, and the dataset is then loaded again on each GPU in the setup method. Defaults to None.

Returns

the dataset with train, val, and test data.

Return type

HDF5Dataset

predict_dataloader()[source]

An iterable or collection of iterables specifying prediction samples.

For more information about multiple dataloaders, see the Lightning documentation.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Returns

A torch.utils.data.DataLoader or a sequence of them specifying prediction samples.

prepare_data(stage: Optional[str] = None)[source]

Prepare dataset containing train, val, test data.

setup(stage: Optional[str] = None) → None[source]

Instantiate the (already prepared) dataset (called on all GPUs).
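The division of labor between prepare_data (write-once, single process) and setup (cheap, per-GPU) can be sketched in plain Python; names and attributes below are illustrative, not myria3d's actual implementation:

```python
# Minimal sketch (no Lightning dependency) of the prepare_data / setup
# pattern: heavy one-time creation vs. per-process instantiation.
class SketchDataModule:
    def __init__(self, hdf5_file_path: str):
        self.hdf5_file_path = hdf5_file_path
        self.dataset = None

    def prepare_data(self, stage=None):
        # Called once, on a single process: heavy, write-once I/O,
        # e.g. building the HDF5 file. No state should be assigned here.
        print(f"(would create {self.hdf5_file_path} once, on rank 0)")

    def setup(self, stage=None):
        # Called on every GPU/process: read-only instantiation of the
        # already-prepared dataset.
        self.dataset = {"path": self.hdf5_file_path}


dm = SketchDataModule("tiles.hdf5")
dm.prepare_data()  # assigns no state
dm.setup()         # instantiates the dataset on each process
```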

test_dataloader()[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see the Lightning documentation.

For data processing use the following pattern: download in prepare_data(), then process and split in setup().

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader()[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern: download in prepare_data(), then process and split in setup().

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

val_dataloader()[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

myria3d.pctl.dataset.hdf5

class myria3d.pctl.dataset.hdf5.HDF5Dataset(hdf5_file_path: str, epsg: str, las_paths_by_split_dict: typing.Dict[typing.Union[typing.Literal['train'], typing.Literal['val'], typing.Literal['test']], typing.List[str]], points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, pre_filter=<function pre_filter_below_n_points>, train_transform: typing.Optional[typing.List[typing.Callable]] = None, eval_transform: typing.Optional[typing.List[typing.Callable]] = None)[source]

Single-file HDF5 dataset for collections of large LAS tiles.

property samples_hdf5_paths

Index all samples in the dataset, if not already done before.

myria3d.pctl.dataset.hdf5.create_hdf5(las_paths_by_split_dict: dict, hdf5_file_path: str, epsg: str, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, subtile_overlap_train: numbers.Number = 0, points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>)[source]

Create an HDF5 dataset file from LAS files.

Parameters
  • las_paths_by_split_dict ([LAS_PATHS_BY_SPLIT_DICT_TYPE]) – should look like las_paths_by_split_dict = {'train': ['dir/las1.las', 'dir/las2.las'], 'val': […], 'test': […]},

  • hdf5_file_path (str) – path to HDF5 dataset,

  • epsg (str) – epsg to force the reading with

  • tile_width (Number, optional) – width of a LAS tile. 1000 by default,

  • subtile_width (Number, optional) – effective width of a subtile (i.e. receptive field). 50 by default,

  • pre_filter – Function to filter out specific subtiles. “pre_filter_below_n_points” by default,

  • subtile_overlap_train (Number, optional) – Overlap for data augmentation of train set. 0 by default,

  • points_pre_transform (Callable) – Function to turn pdal points into a pyg Data object.

myria3d.pctl.dataset.iterable

class myria3d.pctl.dataset.iterable.InferenceDataset(las_file: str, epsg: str, points_pre_transform: typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data] = <function lidar_hd_pre_transform>, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, transform: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], torch_geometric.data.data.Data]] = None, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap: numbers.Number = 0)[source]

Iterable dataset to load samples from a single las file.

get_iterator()[source]

Yield subtiles from all tiles in an exhaustive fashion.

myria3d.pctl.dataset.toy_dataset

Generation of a toy dataset for testing purposes.

myria3d.pctl.dataset.toy_dataset.make_toy_dataset_from_test_file()[source]

Prepare a toy dataset from a single, small LAS file.

The file is first duplicated to get 2 LAS files in each split (train/val/test), and then each file is split into .data files, resulting in a training-ready dataset located in td_prepared.

Parameters
  • src_las_path (str) – input, small LAS file to generate the toy dataset from

  • split_csv (str) – Path to csv with a basename (e.g. ‘123_456.las’) and a split

  • prepared_data_dir (str) – where to copy files (raw subfolder) and to prepare dataset files

Returns

path to directory containing prepared dataset.

Return type

str

myria3d.pctl.dataset.utils

myria3d.pctl.dataset.utils.find_file_in_dir(data_dir: str, basename: str) → str[source]

Query files matching a basename in input_data_dir and its subdirectories.

Parameters

input_data_dir (str) – data directory

Returns

first file path matching the query.

Return type

[str]
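A recursive basename lookup of this kind can be sketched with pathlib; the function name here is illustrative, not myria3d's actual implementation:

```python
from pathlib import Path
import tempfile

# Hedged sketch of a recursive basename search returning the first
# match, as the docstring above describes.
def find_file_in_dir_sketch(data_dir: str, basename: str) -> str:
    # rglob searches the directory and all of its subdirectories.
    matches = sorted(Path(data_dir).rglob(basename))
    return str(matches[0])

with tempfile.TemporaryDirectory() as d:
    sub = Path(d) / "raw"
    sub.mkdir()
    (sub / "123_456.las").touch()
    found = find_file_in_dir_sketch(d, "123_456.las")
    print(found.endswith("123_456.las"))  # True
```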

myria3d.pctl.dataset.utils.get_metadata(las_path: str) → dict[source]

Returns metadata contained in a LAS file.

Parameters

las_path (str) – input LAS path to get metadata from.

Returns

the metadata.

Return type

dict

myria3d.pctl.dataset.utils.get_pdal_info_metadata(las_path: str) → Dict[source]

Read las metadata using pdal info.

Parameters

las_path (str) – input LAS path to read.

Returns

dictionary containing metadata from the las file

Return type

(dict)

myria3d.pctl.dataset.utils.get_pdal_reader(las_path: str, epsg: str) → pdal.pipeline.Reader.las[source]

Standard Reader.

Parameters
  • las_path (str) – input LAS path to read.

  • epsg (str) – epsg to force the reading with

Returns

reader to use in a pipeline.

Return type

pdal.Reader.las

myria3d.pctl.dataset.utils.pdal_read_las_array(las_path: str, epsg: str)[source]

Read LAS as a named array.

Parameters
  • las_path (str) – input LAS path

  • epsg (str) – epsg to force the reading with

Returns

named array with all LAS dimensions, including extra ones, with dict-like access.

Return type

np.ndarray
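"Named array with dict-like access" refers to a numpy structured array, the in-memory format pdal pipelines produce. A small numpy-only illustration (the dimension names below are standard LAS dimensions, the values are made up):

```python
import numpy as np

# A structured array: each field is a named LAS dimension that can be
# accessed like a dict of columns.
points = np.array(
    [(870000.0, 6610000.0, 112.3, 6), (870001.5, 6610002.0, 113.1, 2)],
    dtype=[("X", "f8"), ("Y", "f8"), ("Z", "f8"), ("Classification", "u1")],
)

print(points["Z"])               # [112.3 113.1]
print(points["Classification"])  # [6 2]
```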

myria3d.pctl.dataset.utils.pdal_read_las_array_as_float32(las_path: str, epsg: str)[source]

Read LAS as a named array, cast to float32.

myria3d.pctl.dataset.utils.split_cloud_into_samples(las_path: str, tile_width: numbers.Number, subtile_width: numbers.Number, epsg: str, subtile_overlap: numbers.Number = 0)[source]

Split LAS point cloud into samples.

Parameters
  • las_path (str) – path to raw LAS file

  • tile_width (Number) – width of input LAS file

  • subtile_width (Number) – width of receptive field.

  • epsg (str) – epsg to force the reading with

  • subtile_overlap (Number, optional) – overlap between adjacent subtiles. Defaults to 0.

Yields

Tuple – idx_in_original_cloud, and the points of the sample in pdal input format, cast as floats.
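The splitting logic can be sketched in numpy: slide a square window of subtile_width over the tile, with optional overlap, and yield the indices of the points falling in each window. This is an illustrative reimplementation, not myria3d's actual code:

```python
import numpy as np

# Hedged sketch: yield, for each subtile window, the indices of the
# points of the cloud that fall inside it. Empty windows are skipped.
def split_into_subtiles(xy, tile_width=1000, subtile_width=50, subtile_overlap=0):
    step = subtile_width - subtile_overlap
    for x0 in np.arange(0, tile_width, step):
        for y0 in np.arange(0, tile_width, step):
            mask = (
                (xy[:, 0] >= x0) & (xy[:, 0] < x0 + subtile_width)
                & (xy[:, 1] >= y0) & (xy[:, 1] < y0 + subtile_width)
            )
            idx = np.flatnonzero(mask)
            if idx.size:  # skip empty subtiles
                yield idx

rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(200, 2))
subtiles = list(split_into_subtiles(xy, tile_width=100, subtile_width=50))
print(len(subtiles))  # 4 windows of 50 m over a 100 m tile
```

With zero overlap the windows partition the tile, so every point lands in exactly one subtile; a positive subtile_overlap makes windows overlap, duplicating boundary points across samples (used for train-time augmentation).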

myria3d.pctl.dataloader.dataloader

class myria3d.pctl.dataloader.dataloader.GeometricNoneProofCollater(follow_batch=None, exclude_keys=None)[source]

A Collater that returns None when given empty batch.

class myria3d.pctl.dataloader.dataloader.GeometricNoneProofDataloader(*args, **kwargs)[source]

Torch Geometric’s dataloader is a simple torch DataLoader with a different Collater.

This overrides the collater with a NoneProof one that will not fail if some Data is None.
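The "None-proof" behavior can be sketched in pure Python; this is a stand-in for the torch_geometric Collater, not the actual class:

```python
# Hedged sketch: standard collation, except that an empty batch
# (e.g. all samples filtered out upstream) yields None instead of
# raising an error.
class NoneProofCollater:
    def __call__(self, batch):
        # Drop Nones produced upstream (e.g. by filtering transforms).
        batch = [sample for sample in batch if sample is not None]
        if len(batch) == 0:
            return None  # signal "nothing to process" instead of failing
        return batch  # a real Collater would merge samples into one Batch

collate = NoneProofCollater()
print(collate([None, None]))        # None
print(collate([{"pos": 1}, None]))  # [{'pos': 1}]
```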

myria3d.pctl.points_pre_transform.lidar_hd

myria3d.pctl.points_pre_transform.lidar_hd.lidar_hd_pre_transform(points)[source]

Turn pdal points into torch-geometric Data object.

Builds a composite (average) color channel and calculates NDVI on the fly.

Parameters

points (np.ndarray) – named array of pdal points to transform.

Returns

the point cloud formatted for later deep learning training.

Return type

Data
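The two on-the-fly features can be sketched in numpy. The exact channels myria3d averages are not specified here; this assumes the composite color is the mean of R, G, B, and uses the standard NDVI definition:

```python
import numpy as np

# Hedged sketch of the two derived channels described above.
def composite_and_ndvi(red, green, blue, nir, eps=1e-6):
    # Assumed composite: plain average of the three color channels.
    composite = (red + green + blue) / 3.0
    # Standard NDVI definition: (NIR - Red) / (NIR + Red).
    ndvi = (nir - red) / (nir + red + eps)
    return composite, ndvi

red = np.array([0.2, 0.5])
green = np.array([0.4, 0.5])
blue = np.array([0.6, 0.5])
nir = np.array([0.8, 0.5])
composite, ndvi = composite_and_ndvi(red, green, blue, nir)
print(composite)          # [0.4 0.5]
print(np.round(ndvi, 3))  # [0.6 0. ]
```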

myria3d.pctl.transforms.compose

class myria3d.pctl.transforms.compose.CustomCompose(transforms: List[Callable])[source]

Composes several transforms together. Edited to bypass downstream transforms if None is returned by a transform.

Parameters

transforms (List[Callable]) – List of transforms to compose.
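The bypass behavior can be sketched in a few lines; this is an illustrative stand-in, not the actual CustomCompose:

```python
# Hedged sketch: like a standard Compose, but if any transform returns
# None, the remaining transforms are skipped and None is propagated
# (pairing naturally with a None-proof collater downstream).
class ComposeSketch:
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, data):
        for t in self.transforms:
            data = t(data)
            if data is None:
                return None  # bypass the remaining transforms
        return data

drop_small = lambda d: None if d["n_points"] < 50 else d
double = lambda d: {**d, "n_points": d["n_points"] * 2}

compose = ComposeSketch([drop_small, double])
print(compose({"n_points": 10}))   # None ('double' never runs)
print(compose({"n_points": 100}))  # {'n_points': 200}
```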

myria3d.pctl.transforms.transforms

class myria3d.pctl.transforms.transforms.CopyFullPos[source]

Make a copy of the original positions - to be used for test and inference.

class myria3d.pctl.transforms.transforms.CopyFullPreparedTargets[source]

Make a copy of all, prepared targets - to be used for test.

class myria3d.pctl.transforms.transforms.CopySampledPos[source]

Make a copy of the unnormalized positions of subsampled points - to be used for test and inference.

class myria3d.pctl.transforms.transforms.DropPointsByClass[source]

Drop points with class -1 (i.e. artefacts that would have been mapped to code -1).

class myria3d.pctl.transforms.transforms.MaximumNumNodes(num: int)[source]
class myria3d.pctl.transforms.transforms.MinimumNumNodes(num: int)[source]
class myria3d.pctl.transforms.transforms.NormalizePos(subtile_width=50)[source]

Normalizes xy in the [-1; 1] range by scaling the whole point cloud (including the z dimension). XY are expected to be centered on zero.
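Assuming xy is already centered on zero, this normalization amounts to dividing the whole cloud by half the subtile width, so that xy lands in [-1, 1] and z is scaled by the same factor. A numpy sketch (illustrative, not myria3d's actual code):

```python
import numpy as np

# Hedged sketch: scale positions so xy of a centered subtile falls in
# [-1, 1]; z is divided by the same factor to preserve aspect ratio.
def normalize_pos_sketch(pos: np.ndarray, subtile_width: float = 50.0) -> np.ndarray:
    return pos / (subtile_width / 2.0)

pos = np.array([[-25.0, 25.0, 10.0], [0.0, -10.0, 2.5]])
out = normalize_pos_sketch(pos)
print(out[:, :2].min(), out[:, :2].max())  # -1.0 1.0
```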

class myria3d.pctl.transforms.transforms.NullifyLowestZ[source]

Center on x and y axis only. Set lowest z to 0.

class myria3d.pctl.transforms.transforms.StandardizeRGBAndIntensity[source]

Standardize RGB and log(Intensity) features.

standardize_channel(channel_data: torch.Tensor, clamp_sigma: int = 3)[source]

Sample-wise standardization y* = (y - y_mean) / y_std, with clamping to ignore large values.
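The formula above can be sketched in numpy; the epsilon guard against zero variance is an assumption for illustration:

```python
import numpy as np

# Hedged sketch: y* = (y - mean) / std, then clamp to
# [-clamp_sigma, +clamp_sigma] so extreme values are ignored.
def standardize_channel_sketch(y: np.ndarray, clamp_sigma: float = 3.0) -> np.ndarray:
    standardized = (y - y.mean()) / (y.std() + 1e-8)
    return np.clip(standardized, -clamp_sigma, clamp_sigma)

y = np.array([1.0, 2.0, 3.0, 1000.0])  # one extreme intensity value
out = standardize_channel_sketch(y)
print(out.max() <= 3.0)  # True

# A tighter sigma clamps the outlier to exactly +clamp_sigma.
out2 = standardize_channel_sketch(y, clamp_sigma=1.0)
print(out2.max())  # 1.0
```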

class myria3d.pctl.transforms.transforms.TargetTransform(classification_preprocessing_dict: Dict[int, int], classification_dict: Dict[int, str])[source]

Make target vector based on input classification dictionary.

Example:

Source: y = [6, 6, 17, 9, 1]

Pre-processed with classification_preprocessing_dict = {17: 1, 9: 1}: y’ = [6, 6, 1, 1, 1]

Mapped to consecutive integers with classification_dict = {1: "unclassified", 6: "building"}: y’’ = [1, 1, 0, 0, 0]
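The two-step mapping in the example can be reproduced in plain Python (an illustrative sketch of the transform's logic, not the actual class):

```python
# Step 1 merges classes via classification_preprocessing_dict; step 2
# maps the remaining class codes to consecutive integers following the
# key order of classification_dict.
classification_preprocessing_dict = {17: 1, 9: 1}
classification_dict = {1: "unclassified", 6: "building"}

y = [6, 6, 17, 9, 1]

# Step 1: merge classes (17 and 9 both become 1).
y_pre = [classification_preprocessing_dict.get(c, c) for c in y]

# Step 2: map codes to consecutive integers (1 -> 0, 6 -> 1).
code_to_index = {code: i for i, code in enumerate(classification_dict)}
y_final = [code_to_index[c] for c in y_pre]

print(y_pre)    # [6, 6, 1, 1, 1]
print(y_final)  # [1, 1, 0, 0, 0]
```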

class myria3d.pctl.transforms.transforms.ToTensor(keys: List[str] = ['pos', 'x', 'y'])[source]

Turn np.arrays specified by their keys into Tensor.