myria3d.pctl
Objects related to the preprocessing and loading of Lidar data.
myria3d.pctl.datamodule.hdf5
- class myria3d.pctl.datamodule.hdf5.HDF5LidarDataModule(data_dir: str, split_csv_path: str, hdf5_file_path: str, epsg: str, points_pre_transform: typing.Optional[typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data]] = None, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, subtile_overlap_predict: numbers.Number = 0, batch_size: int = 12, num_workers: int = 1, prefetch_factor: int = 2, transforms: typing.Optional[typing.Dict[str, typing.List[typing.Callable]]] = None, **kwargs)[source]
Datamodule to feed train and validation data to the model.
- property dataset: myria3d.pctl.dataset.hdf5.HDF5Dataset
Abstraction to ease HDF5 dataset instantiation.
- Parameters
las_paths_by_split_dict (LAS_PATHS_BY_SPLIT_DICT_TYPE, optional) – Maps split (val/train/test) to file paths. If specified, the HDF5 file is created at dataset initialization time. Otherwise, a precomputed HDF5 file is used directly without I/O to the HDF5 file. This is useful for multi-GPU training, where data creation is performed in the prepare_data method, and the dataset is then loaded again on each GPU in the setup method. Defaults to None.
- Returns
the dataset with train, val, and test data.
- Return type
HDF5Dataset
- predict_dataloader()[source]
An iterable or collection of iterables specifying prediction samples.
For more information about multiple dataloaders, see the Lightning documentation.
It's recommended that all data downloads and preparation happen in prepare_data().
This dataloader is used by predict().
Note
Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
- Returns
A torch.utils.data.DataLoader or a sequence of them specifying prediction samples.
- prepare_data(stage: Optional[str] = None)[source]
Prepare dataset containing train, val, test data.
- setup(stage: Optional[str] = None) → None[source]
Instantiate the (already prepared) dataset (called on all GPUs).
- test_dataloader()[source]
An iterable or collection of iterables specifying test samples.
For more information about multiple dataloaders, see the Lightning documentation.
For data processing use the following pattern:
- download in prepare_data()
- process and split in setup()
However, the above are only necessary for distributed processing.
Warning
Do not assign state in prepare_data.
This dataloader is used by test().
Note
Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
Note
If you don’t need a test dataset and a test_step(), you don’t need to implement this method.
- train_dataloader()[source]
An iterable or collection of iterables specifying training samples.
For more information about multiple dataloaders, see the Lightning documentation.
The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.
For data processing use the following pattern:
- download in prepare_data()
- process and split in setup()
However, the above are only necessary for distributed processing.
Warning
Do not assign state in prepare_data.
This dataloader is used by fit().
Note
Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
- val_dataloader()[source]
An iterable or collection of iterables specifying validation samples.
For more information about multiple dataloaders, see the Lightning documentation.
The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.
It's recommended that all data downloads and preparation happen in prepare_data().
This dataloader is used by fit() and validate().
Note
Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
Note
If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.
myria3d.pctl.dataset.hdf5
- class myria3d.pctl.dataset.hdf5.HDF5Dataset(hdf5_file_path: str, epsg: str, las_paths_by_split_dict: typing.Dict[typing.Union[typing.Literal['train'], typing.Literal['val'], typing.Literal['test']], typing.List[str]], points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, pre_filter=<function pre_filter_below_n_points>, train_transform: typing.Optional[typing.List[typing.Callable]] = None, eval_transform: typing.Optional[typing.List[typing.Callable]] = None)[source]
Single-file HDF5 dataset for collections of large LAS tiles.
- property samples_hdf5_paths
Index all samples in the dataset, if not already done before.
- myria3d.pctl.dataset.hdf5.create_hdf5(las_paths_by_split_dict: dict, hdf5_file_path: str, epsg: str, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, subtile_overlap_train: numbers.Number = 0, points_pre_transform: typing.Callable = <function lidar_hd_pre_transform>)[source]
Create an HDF5 dataset file from LAS files.
- Parameters
las_paths_by_split_dict (LAS_PATHS_BY_SPLIT_DICT_TYPE) – should look like las_paths_by_split_dict = {'train': ['dir/las1.las', 'dir/las2.las'], 'val': […], 'test': […]}.
tile_width (Number, optional) – width of a LAS tile. 1000 by default.
subtile_width (Number, optional) – effective width of a subtile (i.e. receptive field). 50 by default.
pre_filter – Function to filter out specific subtiles. pre_filter_below_n_points by default.
subtile_overlap_train (Number, optional) – Overlap for data augmentation of the train set. 0 by default.
points_pre_transform (Callable) – Function to turn pdal points into a pyg Data object.
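As a sketch, the split dictionary expected by create_hdf5 can be built in plain Python. The file paths below are placeholders for illustration, not real data:

```python
# Hypothetical LAS paths, for illustration only.
las_paths_by_split_dict = {
    "train": ["dir/las1.las", "dir/las2.las"],
    "val": ["dir/las3.las"],
    "test": ["dir/las4.las"],
}

# Keys must be among train/val/test; values are lists of LAS file paths.
assert set(las_paths_by_split_dict) <= {"train", "val", "test"}
assert all(
    path.endswith(".las")
    for paths in las_paths_by_split_dict.values()
    for path in paths
)

# create_hdf5 would then be called with this dict, an output hdf5_file_path,
# and an epsg code (see the signature above); the call is omitted here since
# it needs real LAS files on disk.
```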
myria3d.pctl.dataset.iterable
- class myria3d.pctl.dataset.iterable.InferenceDataset(las_file: str, epsg: str, points_pre_transform: typing.Callable[[typing.Union[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]], numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[typing.Any]]], bool, int, float, complex, str, bytes, numpy._typing._nested_sequence._NestedSequence[typing.Union[bool, int, float, complex, str, bytes]]]], torch_geometric.data.data.Data] = <function lidar_hd_pre_transform>, pre_filter: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], bool]] = <function pre_filter_below_n_points>, transform: typing.Optional[typing.Callable[[torch_geometric.data.data.Data], torch_geometric.data.data.Data]] = None, tile_width: numbers.Number = 1000, subtile_width: numbers.Number = 50, subtile_overlap: numbers.Number = 0)[source]
Iterable dataset to load samples from a single las file.
myria3d.pctl.dataset.toy_dataset
Generation of a toy dataset for testing purposes.
- myria3d.pctl.dataset.toy_dataset.make_toy_dataset_from_test_file()[source]
Prepare a toy dataset from a single, small LAS file.
The file is first duplicated to get 2 LAS files in each split (train/val/test), and each file is then split into .data files, resulting in a training-ready dataset located in td_prepared.
- Returns
path to directory containing prepared dataset.
- Return type
str
myria3d.pctl.dataset.utils
- myria3d.pctl.dataset.utils.find_file_in_dir(data_dir: str, basename: str) → str[source]
Query files matching a basename in data_dir and its subdirectories.
- Parameters
data_dir (str) – data directory.
- Returns
first file path matching the query.
- Return type
[str]
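The recursive lookup behavior can be illustrated with a simplified stand-in using os.walk (a sketch, not the actual myria3d implementation):

```python
import os
import tempfile

def find_file_in_dir_sketch(data_dir: str, basename: str) -> str:
    """Return the first file named `basename` under data_dir (recursive)."""
    for root, _dirs, files in os.walk(data_dir):
        if basename in files:
            return os.path.join(root, basename)
    raise FileNotFoundError(f"{basename} not found under {data_dir}")

# Demo on a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "subdir")
    os.makedirs(sub)
    open(os.path.join(sub, "tile.las"), "w").close()
    found = find_file_in_dir_sketch(tmp, "tile.las")
    assert found == os.path.join(sub, "tile.las")
```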
- myria3d.pctl.dataset.utils.get_metadata(las_path: str) → dict[source]
Returns metadata contained in a LAS file.
- Parameters
las_path (str) – input LAS path to get metadata from.
- Returns
the metadata.
- Return type
dict
- myria3d.pctl.dataset.utils.get_pdal_info_metadata(las_path: str) → Dict[source]
Read LAS metadata using pdal info.
- Parameters
las_path (str) – input LAS path to read.
- Returns
dictionary containing metadata from the las file
- Return type
(dict)
- myria3d.pctl.dataset.utils.get_pdal_reader(las_path: str, epsg: str) → pdal.pipeline.Reader.las[source]
Standard LAS reader.
- Parameters
las_path (str) – input LAS path to read.
epsg (str) – EPSG code to force the reading with.
- Returns
reader to use in a pipeline.
- Return type
pdal.Reader.las
- myria3d.pctl.dataset.utils.pdal_read_las_array(las_path: str, epsg: str)[source]
Read LAS as a named array.
- myria3d.pctl.dataset.utils.pdal_read_las_array_as_float32(las_path: str, epsg: str)[source]
Read LAS as a named array, cast to float32.
- myria3d.pctl.dataset.utils.split_cloud_into_samples(las_path: str, tile_width: numbers.Number, subtile_width: numbers.Number, epsg: str, subtile_overlap: numbers.Number = 0)[source]
Split LAS point cloud into samples.
- Yields
idx_in_original_cloud, and points of the sample in pdal input format, cast as floats.
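The subtile gridding implied by tile_width, subtile_width, and subtile_overlap can be sketched as follows (origins only; the actual point extraction via pdal is omitted, and the step rule is an assumption for illustration):

```python
def subtile_origins(tile_width: float, subtile_width: float, overlap: float = 0):
    """Yield (x0, y0) origins of subtiles covering a square tile, stepping by
    subtile_width - overlap so that consecutive subtiles overlap."""
    step = subtile_width - overlap
    x = 0.0
    while x < tile_width:
        y = 0.0
        while y < tile_width:
            yield (x, y)
            y += step
        x += step

# A 1000 m tile with 50 m subtiles and no overlap gives a 20 x 20 grid.
origins = list(subtile_origins(1000, 50))
assert len(origins) == 400
```

With a positive overlap the step shrinks, so more (overlapping) samples are produced, which is how subtile_overlap_train augments the training set.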
myria3d.pctl.dataloader.dataloader
myria3d.pctl.points_pre_transform.lidar_hd
myria3d.pctl.transforms.compose
- class myria3d.pctl.transforms.compose.CustomCompose(transforms: List[Callable])[source]
Composes several transforms together. Edited to bypass downstream transforms if None is returned by a transform.
- Parameters
transforms (List[Callable]) – List of transforms to compose.
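The bypass behavior described above can be sketched in pure Python (a simplified stand-in, not the torch_geometric-based class itself):

```python
def compose_with_bypass(transforms):
    """Chain transforms; stop and return None as soon as one returns None."""
    def composed(data):
        for t in transforms:
            data = t(data)
            if data is None:
                return None
        return data
    return composed

double = lambda x: x * 2
drop_if_large = lambda x: None if x > 10 else x

pipeline = compose_with_bypass([double, drop_if_large, double])
assert pipeline(2) == 8        # 2 -> 4 -> 4 -> 8
assert pipeline(6) is None     # 6 -> 12 -> None; downstream transform bypassed
```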
myria3d.pctl.transforms.transforms
- class myria3d.pctl.transforms.transforms.CopyFullPos[source]
Make a copy of the original positions - to be used for test and inference.
- class myria3d.pctl.transforms.transforms.CopyFullPreparedTargets[source]
Make a copy of all, prepared targets - to be used for test.
- class myria3d.pctl.transforms.transforms.CopySampledPos[source]
Make a copy of the unnormalized positions of subsampled points - to be used for test and inference.
- class myria3d.pctl.transforms.transforms.DropPointsByClass[source]
Drop points with class -1 (i.e. artefacts that would have been mapped to code -1).
- class myria3d.pctl.transforms.transforms.NormalizePos(subtile_width=50)[source]
Normalizes xy to the [-1, 1] range by scaling the whole point cloud (including the z dimension). XY are expected to be centered on zero.
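Assuming positions are already centered on zero in xy, the scaling could look like the following sketch (a plain-Python illustration, not the actual implementation):

```python
def normalize_pos_sketch(pos, subtile_width=50):
    """Scale centered xy into [-1, 1]; z is divided by the same factor so
    the whole cloud is scaled uniformly."""
    half_width = subtile_width / 2
    return [[x / half_width, y / half_width, z / half_width] for x, y, z in pos]

# Points spanning the full 50 m subtile map onto the [-1, 1] range in xy.
pos = [[-25.0, 25.0, 5.0], [0.0, 0.0, 0.0]]
out = normalize_pos_sketch(pos)
assert out[0] == [-1.0, 1.0, 0.2]
assert out[1] == [0.0, 0.0, 0.0]
```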
- class myria3d.pctl.transforms.transforms.NullifyLowestZ[source]
Center on x and y axis only. Set lowest z to 0.
- class myria3d.pctl.transforms.transforms.StandardizeRGBAndIntensity[source]
Standardize RGB and log(Intensity) features.
- standardize_channel(channel_data: torch.Tensor, clamp_sigma: int = 3)[source]
Sample-wise standardization y* = (y - y_mean) / y_std, with clamping to ignore large values.
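The sample-wise standardization with clamping can be sketched without torch (a hypothetical pure-Python stand-in; the choice of population standard deviation is an assumption):

```python
import statistics

def standardize_channel_sketch(values, clamp_sigma=3):
    """y* = (y - mean) / std, then clamp to [-clamp_sigma, clamp_sigma]."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against zero variance
    return [max(-clamp_sigma, min(clamp_sigma, (v - mean) / std)) for v in values]

# Outliers are clamped; a well-behaved sample keeps zero mean.
vals = [1.0, 2.0, 3.0, 100.0]
out = standardize_channel_sketch(vals)
assert all(-3 <= v <= 3 for v in out)
assert abs(sum(standardize_channel_sketch([1.0, 2.0, 3.0]))) < 1e-9
```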
- class myria3d.pctl.transforms.transforms.TargetTransform(classification_preprocessing_dict: Dict[int, int], classification_dict: Dict[int, str])[source]
Make target vector based on input classification dictionary.
Example:
Source: y = [6, 6, 17, 9, 1]
Pre-processed, with classification_preprocessing_dict = {17: 1, 9: 1}: y' = [6, 6, 1, 1, 1]
Mapped to consecutive integers, with classification_dict = {1: "unclassified", 6: "building"}: y'' = [1, 1, 0, 0, 0]
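The two-step mapping in the example above can be sketched in plain Python (a simplified stand-in for the transform, not the class itself):

```python
def make_target_sketch(y, classification_preprocessing_dict, classification_dict):
    """Map raw codes via the preprocessing dict, then remap the remaining
    codes to consecutive integers following the order of classification_dict."""
    y_pre = [classification_preprocessing_dict.get(code, code) for code in y]
    code_to_index = {code: i for i, code in enumerate(classification_dict)}
    return [code_to_index[code] for code in y_pre]

y = [6, 6, 17, 9, 1]
out = make_target_sketch(y, {17: 1, 9: 1}, {1: "unclassified", 6: "building"})
assert out == [1, 1, 0, 0, 0]  # matches the example above
```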