myria3d.pctl

Objects related to the preprocessing and loading of Lidar data.

myria3d.pctl.datamodule.hdf5

class myria3d.pctl.datamodule.hdf5.HDF5LidarDataModule(data_dir: str, split_csv_path: str, hdf5_file_path: str, epsg: str, points_pre_transform: typing.Optional[typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data]] = None, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, subtile_overlap_predict: numbers.Number = 0, batch_size: int = 12, num_workers: int = 1, prefetch_factor: int = 2, transforms: typing.Optional[typing.Dict[str, typing.List[typing.Callable]]] = None, **kwargs)[source]

Datamodule to feed train and validation data to the model.

property dataset: myria3d.pctl.dataset.hdf5.HDF5Dataset

Abstraction to ease HDF5 dataset instantiation.

Parameters

las_paths_by_split_dict (LAS_PATHS_BY_SPLIT_DICT_TYPE, optional) – Maps split (val/train/test) to file path. If specified, the hdf5 file is created at dataset initialization time. Otherwise, a precomputed HDF5 file is used directly without I/O to the HDF5 file. This is useful for multi-GPU training, where data creation is performed in the prepare_data method, and the dataset is then loaded again on each GPU in the setup method. Defaults to None.

Returns

the dataset with train, val, and test data.

Return type

HDF5Dataset

predict_dataloader()[source]

An iterable or collection of iterables specifying prediction samples.

For more information about multiple dataloaders, see the Lightning documentation.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Returns

A torch.utils.data.DataLoader or a sequence of them specifying prediction samples.

prepare_data(stage: Optional[str] = None)[source]

Prepare dataset containing train, val, test data.

setup(stage: Optional[str] = None) → None[source]

Instantiate the (already prepared) dataset (called on all GPUs).
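The division of labor between prepare_data (write-once, single process) and setup (cheap, per-GPU) can be sketched in plain Python; names and attributes below are illustrative, not myria3d's actual implementation:

```python
# Minimal sketch (no Lightning dependency) of the prepare_data / setup
# pattern: heavy one-time creation vs. per-process instantiation.
class SketchDataModule:
    def __init__(self, hdf5_file_path: str):
        self.hdf5_file_path = hdf5_file_path
        self.dataset = None

    def prepare_data(self, stage=None):
        # Called once, on a single process: heavy, write-once I/O,
        # e.g. building the HDF5 file. No state should be assigned here.
        print(f"(would create {self.hdf5_file_path} once, on rank 0)")

    def setup(self, stage=None):
        # Called on every GPU/process: read-only instantiation of the
        # already-prepared dataset.
        self.dataset = {"path": self.hdf5_file_path}


dm = SketchDataModule("tiles.hdf5")
dm.prepare_data()  # assigns no state
dm.setup()         # instantiates the dataset on each process
```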

test_dataloader()[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see the Lightning documentation.

For data processing use the following pattern: download in prepare_data(), then process and split in setup().

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader()[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern: download in prepare_data(), then process and split in setup().

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

val_dataloader()[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

myria3d.pctl.dataset.hdf5

class myria3d.pctl.dataset.hdf5.HDF5Dataset(hdf5_file_path: str, epsg: str, las_paths_by_split_dict: typing.Dict[typing.Union[typing.Literal['train'], typing.Literal['val'], typing.Literal['test']], typing.List[str]], points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, pre_filter=<function pre_filter_below_n_points>, train_transform: typing.Optional[typing.List[typing.Callable]] = None, eval_transform: typing.Optional[typing.List[typing.Callable]] = None)[source]

Single-file HDF5 dataset for collections of large LAS tiles.

property samples_hdf5_paths

Index all samples in the dataset, if not already done before.

myria3d.pctl.dataset.hdf5.create_hdf5(las_paths_by_split_dict: dict, hdf5_file_path: str, epsg: str, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, subtile_overlap_train: numbers.Number = 0, points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>)[source]

Create an HDF5 dataset file from LAS files.

Parameters
  • las_paths_by_split_dict ([LAS_PATHS_BY_SPLIT_DICT_TYPE]) – should look like las_paths_by_split_dict = {'train': ['dir/las1.las', 'dir/las2.las'], 'val': […], 'test': […]},

  • hdf5_file_path (str) – path to HDF5 dataset,

  • epsg (str) – epsg to force the reading with

  • tile_width (Number, optional) – width of a LAS tile. 1000 by default,

  • subtile_width (Number, optional) – effective width of a subtile (i.e. receptive field). 50 by default,

  • pre_filter – Function to filter out specific subtiles. “pre_filter_below_n_points” by default,

  • subtile_overlap_train (Number, optional) – Overlap for data augmentation of train set. 0 by default,

  • points_pre_transform (Callable) – Function to turn pdal points into a pyg Data object.

myria3d.pctl.dataset.iterable

class myria3d.pctl.dataset.iterable.InferenceDataset(las_file: str, epsg: str, points_pre_transform: typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data] = <function lidar_hd_pre_transform>, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, transform: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], torch_geometric.data.data.Data]] = None, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap: numbers.Number = 0)[source]

Iterable dataset to load samples from a single las file.

get_iterator()[source]

Yield subtiles from all tiles in an exhaustive fashion.

myria3d.pctl.dataset.toy_dataset

Generation of a toy dataset for testing purposes.

myria3d.pctl.dataset.toy_dataset.make_toy_dataset_from_test_file()[source]

Prepare a toy dataset from a single, small LAS file.

The file is first duplicated to get 2 LAS files in each split (train/val/test), and then each file is split into .data files, resulting in a training-ready dataset located in td_prepared.

Parameters
  • src_las_path (str) – input, small LAS file to generate the toy dataset from

  • split_csv (str) – Path to csv with a basename (e.g. ‘123_456.las’) and a split

  • prepared_data_dir (str) – where to copy files (raw subfolder) and to prepare dataset files

Returns

path to directory containing prepared dataset.

Return type

str

myria3d.pctl.dataset.utils

myria3d.pctl.dataset.utils.find_file_in_dir(data_dir: str, basename: str) → str[source]

Query files matching a basename in input_data_dir and its subdirectories.

Parameters

input_data_dir (str) – data directory

Returns

first file path matching the query.

Return type

[str]
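A recursive basename lookup of this kind can be sketched with pathlib; the function name here is illustrative, not myria3d's actual implementation:

```python
from pathlib import Path
import tempfile

# Hedged sketch of a recursive basename search returning the first
# match, as the docstring above describes.
def find_file_in_dir_sketch(data_dir: str, basename: str) -> str:
    # rglob searches the directory and all of its subdirectories.
    matches = sorted(Path(data_dir).rglob(basename))
    return str(matches[0])

with tempfile.TemporaryDirectory() as d:
    sub = Path(d) / "raw"
    sub.mkdir()
    (sub / "123_456.las").touch()
    found = find_file_in_dir_sketch(d, "123_456.las")
    print(found.endswith("123_456.las"))  # True
```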

myria3d.pctl.dataset.utils.get_metadata(las_path: str) → dict[source]

Returns metadata contained in a LAS file.

Parameters

las_path (str) – input LAS path to get metadata from.

Returns

the metadata.

Return type

dict

myria3d.pctl.dataset.utils.get_pdal_info_metadata(las_path: str) → Dict[source]

Read las metadata using pdal info.

Parameters

las_path (str) – input LAS path to read.

Returns

dictionary containing metadata from the las file

Return type

(dict)

myria3d.pctl.dataset.utils.get_pdal_reader(las_path: str, epsg: str) → pdal.pipeline.Reader.las[source]

Standard Reader.

Parameters
  • las_path (str) – input LAS path to read.

  • epsg (str) – epsg to force the reading with

Returns

reader to use in a pipeline.

Return type

pdal.Reader.las

myria3d.pctl.dataset.utils.pdal_read_las_array(las_path: str, epsg: str)[source]

Read LAS as a named array.

Parameters
  • las_path (str) – input LAS path

  • epsg (str) – epsg to force the reading with

Returns

named array with all LAS dimensions, including extra ones, with dict-like access.

Return type

np.ndarray
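"Named array with dict-like access" refers to a numpy structured array, the in-memory format pdal pipelines produce. A small numpy-only illustration (the dimension names below are standard LAS dimensions, the values are made up):

```python
import numpy as np

# A structured array: each field is a named LAS dimension that can be
# accessed like a dict of columns.
points = np.array(
    [(870000.0, 6610000.0, 112.3, 6), (870001.5, 6610002.0, 113.1, 2)],
    dtype=[("X", "f8"), ("Y", "f8"), ("Z", "f8"), ("Classification", "u1")],
)

print(points["Z"])               # [112.3 113.1]
print(points["Classification"])  # [6 2]
```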

myria3d.pctl.dataset.utils.pdal_read_las_array_as_float32(las_path: str, epsg: str)[source]

Read LAS as a named array, cast to float32.

myria3d.pctl.dataset.utils.split_cloud_into_samples(las_path: str, tile_width: numbers.Number, subtile_width: numbers.Number, epsg: str, subtile_overlap: numbers.Number = 0)[source]

Split LAS point cloud into samples.

Parameters
  • las_path (str) – path to raw LAS file

  • tile_width (Number) – width of input LAS file

  • subtile_width (Number) – width of receptive field.

  • epsg (str) – epsg to force the reading with

  • subtile_overlap (Number, optional) – overlap between adjacent subtiles. Defaults to 0.

Yields

Tuple – idx_in_original_cloud, and the points of the sample in pdal input format, cast as floats.
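The splitting logic can be sketched in numpy: slide a square window of subtile_width over the tile, with optional overlap, and yield the indices of the points falling in each window. This is an illustrative reimplementation, not myria3d's actual code:

```python
import numpy as np

# Hedged sketch: yield, for each subtile window, the indices of the
# points of the cloud that fall inside it. Empty windows are skipped.
def split_into_subtiles(xy, tile_width=1000, subtile_width=50, subtile_overlap=0):
    step = subtile_width - subtile_overlap
    for x0 in np.arange(0, tile_width, step):
        for y0 in np.arange(0, tile_width, step):
            mask = (
                (xy[:, 0] >= x0) & (xy[:, 0] < x0 + subtile_width)
                & (xy[:, 1] >= y0) & (xy[:, 1] < y0 + subtile_width)
            )
            idx = np.flatnonzero(mask)
            if idx.size:  # skip empty subtiles
                yield idx

rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(200, 2))
subtiles = list(split_into_subtiles(xy, tile_width=100, subtile_width=50))
print(len(subtiles))  # 4 windows of 50 m over a 100 m tile
```

With zero overlap the windows partition the tile, so every point lands in exactly one subtile; a positive subtile_overlap makes windows overlap, duplicating boundary points across samples (used for train-time augmentation).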

myria3d.pctl.dataloader.dataloader

class myria3d.pctl.dataloader.dataloader.GeometricNoneProofCollater(follow_batch=None, exclude_keys=None)[source]

A Collater that returns None when given empty batch.

class myria3d.pctl.dataloader.dataloader.GeometricNoneProofDataloader(*args, **kwargs)[source]

Torch Geometric’s dataloader is a simple torch DataLoader with a different Collater.

This overrides the collater with a NoneProof one that will not fail if some Data is None.
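The "None-proof" behavior can be sketched in pure Python; this is a stand-in for the torch_geometric Collater, not the actual class:

```python
# Hedged sketch: standard collation, except that an empty batch
# (e.g. all samples filtered out upstream) yields None instead of
# raising an error.
class NoneProofCollater:
    def __call__(self, batch):
        # Drop Nones produced upstream (e.g. by filtering transforms).
        batch = [sample for sample in batch if sample is not None]
        if len(batch) == 0:
            return None  # signal "nothing to process" instead of failing
        return batch  # a real Collater would merge samples into one Batch

collate = NoneProofCollater()
print(collate([None, None]))        # None
print(collate([{"pos": 1}, None]))  # [{'pos': 1}]
```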

myria3d.pctl.points_pre_transform.lidar_hd

myria3d.pctl.points_pre_transform.lidar_hd.lidar_hd_pre_transform(points)[source]

Turn pdal points into torch-geometric Data object.

Builds a composite (average) color channel and calculates NDVI on the fly.

Parameters

points (np.ndarray) – named array of pdal points to transform.

Returns

the point cloud formatted for later deep learning training.

Return type

Data
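The two on-the-fly features can be sketched in numpy. The exact channels myria3d averages are not specified here; this assumes the composite color is the mean of R, G, B, and uses the standard NDVI definition:

```python
import numpy as np

# Hedged sketch of the two derived channels described above.
def composite_and_ndvi(red, green, blue, nir, eps=1e-6):
    # Assumed composite: plain average of the three color channels.
    composite = (red + green + blue) / 3.0
    # Standard NDVI definition: (NIR - Red) / (NIR + Red).
    ndvi = (nir - red) / (nir + red + eps)
    return composite, ndvi

red = np.array([0.2, 0.5])
green = np.array([0.4, 0.5])
blue = np.array([0.6, 0.5])
nir = np.array([0.8, 0.5])
composite, ndvi = composite_and_ndvi(red, green, blue, nir)
print(composite)          # [0.4 0.5]
print(np.round(ndvi, 3))  # [0.6 0. ]
```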

myria3d.pctl.transforms.compose

class myria3d.pctl.transforms.compose.CustomCompose(transforms: List[Callable])[source]

Composes several transforms together. Edited to bypass downstream transforms if None is returned by a transform.

Parameters

transforms (List[Callable]) – List of transforms to compose.
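The bypass behavior can be sketched in a few lines; this is an illustrative stand-in, not the actual CustomCompose:

```python
# Hedged sketch: like a standard Compose, but if any transform returns
# None, the remaining transforms are skipped and None is propagated
# (pairing naturally with a None-proof collater downstream).
class ComposeSketch:
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, data):
        for t in self.transforms:
            data = t(data)
            if data is None:
                return None  # bypass the remaining transforms
        return data

drop_small = lambda d: None if d["n_points"] < 50 else d
double = lambda d: {**d, "n_points": d["n_points"] * 2}

compose = ComposeSketch([drop_small, double])
print(compose({"n_points": 10}))   # None ('double' never runs)
print(compose({"n_points": 100}))  # {'n_points': 200}
```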

myria3d.pctl.transforms.transforms

class myria3d.pctl.transforms.transforms.CopyFullPos[source]

Make a copy of the original positions - to be used for test and inference.

class myria3d.pctl.transforms.transforms.CopyFullPreparedTargets[source]

Make a copy of all, prepared targets - to be used for test.

class myria3d.pctl.transforms.transforms.CopySampledPos[source]

Make a copy of the unnormalized positions of subsampled points - to be used for test and inference.

class myria3d.pctl.transforms.transforms.DropPointsByClass[source]

Drop points with class -1 (i.e. artefacts that would have been mapped to code -1).

class myria3d.pctl.transforms.transforms.MaximumNumNodes(num: int)[source]
class myria3d.pctl.transforms.transforms.MinimumNumNodes(num: int)[source]
class myria3d.pctl.transforms.transforms.NormalizePos(subtile_width=50)[source]

Normalizes xy in the [-1; 1] range by scaling the whole point cloud (including the z dimension). XY are expected to be centered on zero.
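Assuming xy is already centered on zero, this normalization amounts to dividing the whole cloud by half the subtile width, so that xy lands in [-1, 1] and z is scaled by the same factor. A numpy sketch (illustrative, not myria3d's actual code):

```python
import numpy as np

# Hedged sketch: scale positions so xy of a centered subtile falls in
# [-1, 1]; z is divided by the same factor to preserve aspect ratio.
def normalize_pos_sketch(pos: np.ndarray, subtile_width: float = 50.0) -> np.ndarray:
    return pos / (subtile_width / 2.0)

pos = np.array([[-25.0, 25.0, 10.0], [0.0, -10.0, 2.5]])
out = normalize_pos_sketch(pos)
print(out[:, :2].min(), out[:, :2].max())  # -1.0 1.0
```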

class myria3d.pctl.transforms.transforms.NullifyLowestZ[source]

Center on x and y axis only. Set lowest z to 0.

class myria3d.pctl.transforms.transforms.StandardizeRGBAndIntensity[source]

Standardize RGB and log(Intensity) features.

standardize_channel(channel_data: torch.Tensor, clamp_sigma: int = 3)[source]

Sample-wise standardization y* = (y - y_mean) / y_std, with clamping to ignore large values.
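The formula above can be sketched in numpy; the epsilon guard against zero variance is an assumption for illustration:

```python
import numpy as np

# Hedged sketch: y* = (y - mean) / std, then clamp to
# [-clamp_sigma, +clamp_sigma] so extreme values are ignored.
def standardize_channel_sketch(y: np.ndarray, clamp_sigma: float = 3.0) -> np.ndarray:
    standardized = (y - y.mean()) / (y.std() + 1e-8)
    return np.clip(standardized, -clamp_sigma, clamp_sigma)

y = np.array([1.0, 2.0, 3.0, 1000.0])  # one extreme intensity value
out = standardize_channel_sketch(y)
print(out.max() <= 3.0)  # True

# A tighter sigma clamps the outlier to exactly +clamp_sigma.
out2 = standardize_channel_sketch(y, clamp_sigma=1.0)
print(out2.max())  # 1.0
```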

class myria3d.pctl.transforms.transforms.TargetTransform(classification_preprocessing_dict: Dict[int, int], classification_dict: Dict[int, str])[source]

Make target vector based on input classification dictionary.

Example:

Source: y = [6, 6, 17, 9, 1]

Pre-processed with classification_preprocessing_dict = {17: 1, 9: 1}: y’ = [6, 6, 1, 1, 1]

Mapped to consecutive integers with classification_dict = {1: "unclassified", 6: "building"}: y’’ = [1, 1, 0, 0, 0]
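The two-step mapping in the example can be reproduced in plain Python (an illustrative sketch of the transform's logic, not the actual class):

```python
# Step 1 merges classes via classification_preprocessing_dict; step 2
# maps the remaining class codes to consecutive integers following the
# key order of classification_dict.
classification_preprocessing_dict = {17: 1, 9: 1}
classification_dict = {1: "unclassified", 6: "building"}

y = [6, 6, 17, 9, 1]

# Step 1: merge classes (17 and 9 both become 1).
y_pre = [classification_preprocessing_dict.get(c, c) for c in y]

# Step 2: map codes to consecutive integers (1 -> 0, 6 -> 1).
code_to_index = {code: i for i, code in enumerate(classification_dict)}
y_final = [code_to_index[c] for c in y_pre]

print(y_pre)    # [6, 6, 1, 1, 1]
print(y_final)  # [1, 1, 0, 0, 0]
```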

class myria3d.pctl.transforms.transforms.ToTensor(keys: List[str] = ['pos', 'x', 'y'])[source]

Turn np.arrays specified by their keys into Tensor.