Pre-process Data

The Pre-process Data step ensures that the data is in a format the tool can ingest before the Data Import step. The dataset must be segregated into a folder and stored in an Amazon S3 bucket.

  1. Format Data - Point cloud formatted to .las

  2. Segregate Data - Folder structure

  3. Store Data - Amazon S3 Bucket

Format Point Cloud Data

  • The tool supports LiDAR & RaDar datasets in .las format only.

  • Source point cloud datasets may come in other formats, such as .bin, .pcd, .json, or .txt, and must be converted to .las before import.

    • Data in .bin format is first converted to .pcd and then converted to .las.

      • Read about .bin to .pcd conversion for NuScenes data set here

      • CloudCompare can be used to convert .pcd to .las from the command line.

An example of how to convert a .pcd file to .las with CloudCompare:

cloudcompare.CloudCompare -SILENT -O <filename>.pcd -C_EXPORT_FMT LAS -SAVE_CLOUDS FILE <filename>.las
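
When many files need converting, the two steps above can be scripted. The following is a minimal Python sketch, not the tool's own converter. It assumes the nuScenes LiDAR .bin layout of five little-endian float32 values per point (x, y, z, intensity, ring index) and writes an ASCII .pcd. The function names bin_to_pcd_text and cloudcompare_command are illustrative, and the second function only assembles the CloudCompare command shown above for a given file.

```python
import struct
from pathlib import Path

# Assumed nuScenes layout: x, y, z, intensity, ring index as float32.
POINT_FORMAT = "<5f"
POINT_SIZE = struct.calcsize(POINT_FORMAT)


def bin_to_pcd_text(raw):
    """Convert raw .bin bytes to the text of an ASCII .pcd file."""
    if len(raw) % POINT_SIZE != 0:
        raise ValueError("byte count is not a multiple of one point record")
    # Keep x, y, z, intensity and drop the ring index.
    points = [record[:4] for record in struct.iter_unpack(POINT_FORMAT, raw)]
    header = [
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z intensity",
        "SIZE 4 4 4 4",
        "TYPE F F F F",
        "COUNT 1 1 1 1",
        f"WIDTH {len(points)}",
        "HEIGHT 1",
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(points)}",
        "DATA ascii",
    ]
    body = [" ".join(repr(value) for value in point) for point in points]
    return "\n".join(header + body) + "\n"


def cloudcompare_command(pcd_path, executable="cloudcompare.CloudCompare"):
    """Build the .pcd to .las command from the example above as an argument list."""
    las_path = Path(pcd_path).with_suffix(".las")
    return [
        executable, "-SILENT", "-O", str(pcd_path),
        "-C_EXPORT_FMT", "LAS", "-SAVE_CLOUDS", "FILE", str(las_path),
    ]
```

The returned list can be passed to subprocess.run(command, check=True) on a machine where CloudCompare is installed. The executable name may differ depending on how CloudCompare was installed.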

Segregate Data into Folders

The way the data is segregated impacts the visibility of tasks that load on the tool for the labeling experts. Below is the terminology that will be frequently used in this document to segregate data correctly.

Data Terminology

Task

A task is defined as the labeling work performed on one frame that loads on the annotation tool.

Frame

A frame is a visual dataset that loads on the annotation tool and contains image data along with its corresponding sensor data (LiDAR, RaDar, etc.).

Batch

A batch (or sequence) is the set of frames that loads on the annotation tool for a single expert. The size of a batch can vary between 1 and n.

  • Submission happens for a Batch.

Data Reflection on Tool

To understand segregation better, consider the following example:

There are 100 frames in a sequence that need to be annotated. A single batch should load at most 10 frames. This limit is set through the batch limit at the time of importing data.
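
The arithmetic of this example can be sketched in Python. How the tool assigns frames to batches internally is not described here, so slicing consecutive frames is an assumption, and split_into_batches is an illustrative name rather than part of the tool.

```python
def split_into_batches(frames, batch_limit):
    """Group consecutive frames into batches of at most batch_limit frames."""
    if batch_limit < 1:
        raise ValueError("batch_limit must be at least 1")
    return [
        frames[start:start + batch_limit]
        for start in range(0, len(frames), batch_limit)
    ]


# 100 frames with a batch limit of 10 give 10 batches of 10 frames each.
frames = [f"frame_{index:03d}" for index in range(1, 101)]
batches = split_into_batches(frames, batch_limit=10)
```

If the frame count is not a multiple of the limit, the last batch in this sketch is simply smaller; for example, 95 frames give nine batches of 10 and one batch of 5.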

Here is the representation of this batch on the tool.

  • There is a batch of 10 frames (BLUE).

  • Each frame has 1 point cloud (ORANGE) and 3 camera images (GREEN) linked to it.

  • Each camera is synced with the point cloud (ORANGE). The images are synchronized with the corresponding point cloud when calibration details are available.

  • The LiDAR point cloud may include pre-existing labeled data. (PINK)

Based on the above example, the data folder needs to be reorganized in the following format:

  • Respective Camera Data - Green folders

  • LiDAR data - Orange folder

  • Calibration - Red folder (preferable to have calibration to sync the camera with point cloud for quicker reference)

  • Pre-labelled annotation data - Pink folder (when pre-labelled data is available)

  • Velocity / ego vehicle data - White folder (optional, based on the output required)

Preparing Data Folder

  • The data can be stored all together in one folder or split across multiple folders.

    • Option 1: Prepare 1 folder with all 100 frames data

      • Ideally all frames belonging to a single sequence should be stored together.

    • Option 2: Prepare multiple folders to divide the data of the 100 frames.

      • This is useful when the frames in a batch are not sequenced

  • The folder name should not contain any spaces, e.g. Folder_1
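
The subfolders described in Steps 1 to 5 of this section can be created up front. The sketch below is one possible way to do that in Python. The camera folder names (Camera_1, Camera_2) and the LiDAR folder name are placeholders chosen for the example, while calibration, lidar_annotation, and ego_data are the folder names this section requires.

```python
from pathlib import Path

# Folder names fixed by this guide.
FIXED_SUBFOLDERS = ("calibration", "lidar_annotation", "ego_data")


def planned_folders(root_name, camera_names, lidar_name="LiDAR"):
    """Return the subfolder paths for one data folder, rejecting spaces in its name."""
    if any(character.isspace() for character in root_name):
        raise ValueError(f"folder name must not contain spaces: {root_name!r}")
    root = Path(root_name)
    return [root / name for name in [*camera_names, lidar_name, *FIXED_SUBFOLDERS]]


def create_folders(paths):
    """Create every planned folder, including missing parents."""
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)


paths = planned_folders("Folder_1", ["Camera_1", "Camera_2"])
```

Calling create_folders(paths) then creates the folders on disk; the calibration, lidar_annotation, and ego_data folders can be left out when that data is not available.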

Step 1: Create Camera Folders

  • Create an image folder for each camera sensor in Folder_1. For example, Camera 1, Camera 2, Camera 3 ... Camera n.

  • The camera folder name is shown with the corresponding camera images on the annotation tool. Name the camera folders so that they give context to the subject expert.

  • The image files should be in .jpeg or .png format.

  • Each camera folder should contain all the images belonging to that camera sensor across all the frames stored in Folder_1.

  • The image files in the camera folders should have identical names if they were captured at the same instant. For example, in Frame 1, Camera 1, Camera 2 and Camera 3 will all have the image file saved as xyz_timestamp1.jpeg.

Step 2: Create LiDAR Folder

  • Create a folder containing files of the point cloud data across all the frames in Folder_1.

  • The point cloud folder can be named arbitrarily. For consistency, consider naming it LiDAR or PCT.

  • The point cloud data must be in .las format, as this is the only format the annotation tool supports. Refer to the Format Point Cloud Data section for more details.

  • The folder should include the point cloud files for all frames stored in Folder_1.

  • The point cloud file in the LiDAR folder must have the same name as the images from the cameras associated with that frame. For instance, if the image files for all three cameras (stored in their respective camera folders, i.e. Camera 1, Camera 2 and Camera 3) are named xyz_timestamp1.jpeg for Frame 1, then the point cloud file for that frame in the LiDAR folder must be named xyz_timestamp1.las.
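
The naming rule from Steps 1 and 2 can be checked before upload. The following Python sketch compares file names without their extensions and reports frames that are present in the LiDAR folder but missing in a camera folder, or the other way round. The function names and the sample folder names are illustrative only.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpeg", ".png"}
POINT_CLOUD_EXTENSIONS = {".las"}


def stems(filenames, extensions):
    """Return file names without extension, keeping only the given extensions."""
    return {
        Path(name).stem
        for name in filenames
        if Path(name).suffix.lower() in extensions
    }


def mismatched_frames(lidar_files, camera_files):
    """Map each camera folder to the frame names found on only one side."""
    lidar = stems(lidar_files, POINT_CLOUD_EXTENSIONS)
    report = {}
    for folder, files in camera_files.items():
        difference = lidar ^ stems(files, IMAGE_EXTENSIONS)
        if difference:
            report[folder] = sorted(difference)
    return report


def list_files(folder):
    """List the file names directly inside a folder on disk."""
    return [path.name for path in Path(folder).iterdir() if path.is_file()]
```

For a folder on disk, the inputs can be gathered with list_files, for example mismatched_frames(list_files("Folder_1/LiDAR"), {"Camera_1": list_files("Folder_1/Camera_1")}). An empty result means every frame has matching names.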

Step 3: Create Calibration Folder

  • When camera sensors and point cloud files are available, a calibration file may also be present. If it exists, create a dedicated folder to store all calibration data in .json format.

  • This folder must be named calibration.

  • If the calibration is identical for all camera sensors, store a single calibration file named calibration.json in it.

  • If the calibration data differs between camera sensors, then either:

    • Prepare one separate file for each camera sensor and save it under the calibration folder OR,

    • Prepare one .json file with a separate block for each camera calibration.

  • Compute the calibration matrix by multiplying the camera_intrinsic matrix with the inverse of the camera_extrinsic matrix:

calibration_matrix = camera_intrinsic * inverse(camera_extrinsic)

JSON format for the calibration file:
{
   "matrices": [
       {
           "fromWorld": {
               "elements": [ // All elements are listed in column-major order
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0,
                   0
               ]
           },
           "name": "LIDAR_TOP"
       },
       {
           "fromWorld": {
               "elements": [
                   839.3693313296216,
                   480.50874358343594,
                   0.9999117986552755,
                   0,
                   -1244.2586937950568,
                   20.203096824064712,
                   0.010155927635204103,
                   0,
                   -8.2467494447129,
                   -1248.651045792533,
                   0.008558740785910282,
                   0,
                   -1427.154718970285,
                   1039.0897354143844,
                   -1.7346966604269405,
                   1
               ]
           },
           "name": "CAM_FRONT"
       }
   ]
}
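
The formula and the column-major layout above can be sketched in plain Python. This is an illustration under stated assumptions, not the tool's implementation: the 3x3 intrinsic matrix is padded to 4x4, and the extrinsic matrix is assumed to be a 4x4 rigid transform (rotation plus translation), so its inverse can be written in closed form. All function names are hypothetical.

```python
def pad_intrinsic(K):
    """Embed a 3x3 intrinsic matrix in a 4x4 homogeneous matrix."""
    return [list(K[0]) + [0.0], list(K[1]) + [0.0], list(K[2]) + [0.0],
            [0.0, 0.0, 0.0, 1.0]]


def invert_rigid(T):
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    rotation_t = [[T[c][r] for c in range(3)] for r in range(3)]
    t = [T[r][3] for r in range(3)]
    new_t = [-sum(rotation_t[r][k] * t[k] for k in range(3)) for r in range(3)]
    return [rotation_t[r] + [new_t[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def column_major(M):
    """Flatten a 4x4 matrix column by column."""
    return [M[r][c] for c in range(4) for r in range(4)]


def calibration_entry(name, camera_intrinsic, camera_extrinsic):
    """Build one entry of the "matrices" list in the calibration file."""
    matrix = matmul(pad_intrinsic(camera_intrinsic), invert_rigid(camera_extrinsic))
    return {"fromWorld": {"elements": column_major(matrix)}, "name": name}
```

The resulting dictionaries can be collected into {"matrices": [...]} and written with json.dump. If the extrinsic matrix is not a pure rigid transform, a general matrix inverse is needed instead of invert_rigid.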

Step 4: Create a Folder for Pre-Labelled Annotations

  1. Store the pre-labelled annotations in the lidar_annotation folder.

Pre-labelled annotations are stored as JSON files, one JSON file per frame. For example, if you have 100 frames, the corresponding files should be named 1.json, 2.json, 3.json, and so on up to 100.json.

The JSON schema below describes the pre-labelled files. It supports cuboid, 2D bbox (rectangle), and 3D polyline annotations.

JSON schema for the pre-labelled file:
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Annotation File",
    "type": "object",
    "properties": {
        "annotations": {
            "type": "array",
            "items": {
                "type": "object",
                "allOf": [
                    {
                        "if": {
                            "properties": {
                                "object_type": {
                                    "const": "rectangle"
                                }
                            }
                        },
                        "then": {
                            "required": [
                                "object_type",
                                "geometry"
                            ],
                            "properties": {
                                "object_type": {
                                    "const": "rectangle"
                                },
                                "class": {
                                    "type": "string"
                                },
                                "identity": {
                                    "type": [
                                        "integer",
                                        "string"
                                    ]
                                },
                                "reference_folder": {
                                    "type": "string"
                                },
                                "geometry": {
                                    "type": "object",
                                    "properties": {
                                        "coordinates": {
                                            "type": "array",
                                            "items": {
                                                "type": "object",
                                                "properties": {
                                                    "x": {
                                                        "type": "number"
                                                    },
                                                    "y": {
                                                        "type": "number"
                                                    }
                                                },
                                                "required": [
                                                    "x",
                                                    "y"
                                                ]
                                            },
                                            "minItems": 4,
                                            "maxItems": 4
                                        }
                                    },
                                    "required": [
                                        "coordinates"
                                    ]
                                },
                                "taxonomy_attribute": {
                                    "type": "object"
                                }
                            }
                        }
                    },
                    {
                        "if": {
                            "properties": {
                                "object_type": {
                                    "const": "cuboid"
                                }
                            }
                        },
                        "then": {
                            "required": [
                                "object_type",
                                "geometry"
                            ],
                            "properties": {
                                "object_type": {
                                    "const": "cuboid"
                                },
                                "id": {
                                    "type": "string"
                                },
                                "class": {
                                    "type": "string"
                                },
                                "identity": {
                                    "type": [
                                        "integer",
                                        "string"
                                    ]
                                },
                                "classId": {
                                    "type": [
                                        "string",
                                        "integer"
                                    ]
                                },
                                "geometry": {
                                    "type": "object",
                                    "properties": {
                                        "position": {
                                            "type": "object",
                                            "properties": {
                                                "x": {
                                                    "type": "number"
                                                },
                                                "y": {
                                                    "type": "number"
                                                },
                                                "z": {
                                                    "type": "number"
                                                }
                                            },
                                            "required": [
                                                "x",
                                                "y",
                                                "z"
                                            ]
                                        },
                                        "rotation": {
                                            "type": "object",
                                            "properties": {
                                                "x": {
                                                    "type": "number"
                                                },
                                                "y": {
                                                    "type": "number"
                                                },
                                                "z": {
                                                    "type": "number"
                                                }
                                            },
                                            "required": [
                                                "x",
                                                "y",
                                                "z"
                                            ]
                                        },
                                        "boxSize": {
                                            "type": "object",
                                            "properties": {
                                                "x": {
                                                    "type": "number"
                                                },
                                                "y": {
                                                    "type": "number"
                                                },
                                                "z": {
                                                    "type": "number"
                                                }
                                            },
                                            "required": [
                                                "x",
                                                "y",
                                                "z"
                                            ]
                                        }
                                    },
                                    "required": [
                                        "position",
                                        "rotation",
                                        "boxSize"
                                    ]
                                },
                                "taxonomy_attribute": {
                                    "type": "object"
                                },
                                "isGeometryKeyFrame": {
                                    "type": "boolean"
                                },
                                "isAttributeKeyFrame": {
                                    "type": "boolean"
                                }
                            }
                        }
                    },
                    {
                        "if": {
                            "properties": {
                                "object_type": {
                                    "const": "polyline"
                                }
                            }
                        },
                        "then": {
                            "required": [
                                "object_type",
                                "geometry"
                            ],
                            "properties": {
                                "object_type": {
                                    "const": "polyline"
                                },
                                "id": {
                                    "type": "string"
                                },
                                "identity": {
                                    "type": [
                                        "integer",
                                        "string"
                                    ]
                                },
                                "geometry": {
                                    "type": "object",
                                    "properties": {
                                        "points": {
                                            "type": "array",
                                            "items": {
                                                "type": "object",
                                                "properties": {
                                                    "position": {
                                                        "type": "object",
                                                        "properties": {
                                                            "x": {
                                                                "type": "number"
                                                            },
                                                            "y": {
                                                                "type": "number"
                                                            },
                                                            "z": {
                                                                "type": "number"
                                                            }
                                                        },
                                                        "required": [
                                                            "x",
                                                            "y",
                                                            "z"
                                                        ]
                                                    }
                                                },
                                                "required": [
                                                    "position"
                                                ]
                                            }
                                        },
                                        "thickness": {
                                            "type": "number"
                                        }
                                    },
                                    "required": [
                                        "points",
                                        "thickness"
                                    ]
                                },
                                "taxonomy_attribute": {
                                    "type": "object"
                                }
                            }
                        }
                    }
                ]
            }
        }
    },
    "required": [
        "annotations"
    ]
}

Sample snippet of a pre-labelled file (a cuboid is shown; 2D bbox and 3D polyline entries follow the schema above)

{
  "annotations": [
    {
      "id" : "10e3547e-0ffc-11f0-beff-c9af1eb6b655", // Optional -- UUIDv4 ID for all annotations across sequence
      "class": "car",
      "object_type": "cuboid",
      "taxonomy_attribute": {},
      "geometry": {
        "position": { // Position in metres
          "x": 5.992325288342432,
          "y": 4.904602559666202,
          "z": 1.5813289166406617
        },
        "rotation": { // Rotation as Euler angles (in radians) in ZYX order
          "x": 0,
          "y": 0,
          "z": 0
        },
        "boxSize": { 
          "x": 4.837695807772452,
          "y": 4.519224086476998,
          "z": 2.4204508776208957
        }
      },
      "identity": 1, // Identity must start from 1
      "isGeometryKeyFrame": true // A geometry key frame remains untouched by interpolation
    }
  ]
}
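
A pre-labelled file like the sample above can also be generated programmatically. The following Python sketch builds a cuboid entry and runs a light check of the required geometry keys listed in the schema. It is not a full JSON Schema validation, and make_cuboid, check_annotation, and frame_document are hypothetical helper names.

```python
import json

# Required geometry keys per object type, taken from the schema above.
REQUIRED_GEOMETRY_KEYS = {
    "cuboid": ("position", "rotation", "boxSize"),
    "rectangle": ("coordinates",),
    "polyline": ("points", "thickness"),
}


def make_cuboid(identity, label, position, rotation, box_size):
    """Build one cuboid annotation; each argument triple is (x, y, z)."""
    return {
        "class": label,
        "object_type": "cuboid",
        "identity": identity,  # identities start from 1
        "taxonomy_attribute": {},
        "geometry": {
            "position": dict(zip("xyz", position)),  # metres
            "rotation": dict(zip("xyz", rotation)),  # radians
            "boxSize": dict(zip("xyz", box_size)),
        },
    }


def check_annotation(annotation):
    """Return a list of problems; an empty list means the basic checks passed."""
    kind = annotation.get("object_type")
    if kind not in REQUIRED_GEOMETRY_KEYS:
        return [f"unsupported object_type: {kind!r}"]
    geometry = annotation.get("geometry")
    if not isinstance(geometry, dict):
        return ["geometry is required"]
    return [
        f"geometry.{key} is required for {kind}"
        for key in REQUIRED_GEOMETRY_KEYS[kind]
        if key not in geometry
    ]


def frame_document(annotations):
    """Serialise the annotations of one frame as the text of its JSON file."""
    return json.dumps({"annotations": annotations}, indent=2)
```

The text returned by frame_document for the first frame would be saved as lidar_annotation/1.json, following the file naming described in this step.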

Step 5: Create a Folder for Ego Data

Ego pose data is used to enable features such as vehicle velocity and merged point clouds. To calculate the reference velocity of objects around the ego vehicle, the ego data for each frame should be provided with the dataset.

  • Create a folder within Folder_1 containing files of the ego data for each frame.

  • This folder must be named ego_data.

  • The ego data file in the ego_data folder must have the same name as the point cloud file for that frame. For example, if the point cloud file for Frame 1 is named xyz_timestamp.las in the LiDAR folder, then the ego data file must be named xyz_timestamp.json in the ego_data folder.

  • To capture the velocity of objects around the ego vehicle, each file within the folder must include the "timestamp_epoch_ns" information

    • timestamp_epoch_ns is the timestamp at which each frame is captured

    • It is represented as a Unix epoch timestamp in nanoseconds (ns).

  • To support the merged point cloud functionality, the ego data for each frame is either calculated using the vanilla ICP registration algorithm or provided with the dataset, in which case it must be placed in the ego_data folder.

Snippet of the ego data
{
 "ego": {
   "timestamp_epoch_ns": 50904429,
   "utmHeading_deg": 0.0,
   "utmX_m": 0.0,
   "utmY_m": 0.0,
   "utmZ_m": 0.0
 }
}

  • This file should include the utmHeading_deg, utmX_m, utmY_m, and utmZ_m fields.

    • Translation (x, y, z):

      • prev_utmX_m: The distance the object has moved along the x-axis (in meters) with respect to the 1st frame.

      • prev_utmY_m: The distance the object has moved along the y-axis (in meters) with respect to the 1st frame.

      • prev_utmZ_m: The distance the object has moved along the z-axis (in meters) with respect to the 1st frame.

    • Rotation (yaw, pitch, roll):

      • prev_utmHeading_deg: The angle of rotation around the yaw axis (in degrees) with respect to the 1st frame.
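
The ego data files can be produced with a short script. The Python sketch below builds one record in the shape of the snippet above and, as an illustration of why timestamp_epoch_ns is needed, estimates the ego speed between two frames from the position change and the time difference. How the tool itself derives velocities is not specified here, so ego_speed_mps is only an assumed, simplified calculation, and both function names are hypothetical.

```python
import json
import math


def ego_record(timestamp_epoch_ns, utm_x_m, utm_y_m, utm_z_m, utm_heading_deg):
    """Build the content of one ego data file for a single frame."""
    return {
        "ego": {
            "timestamp_epoch_ns": timestamp_epoch_ns,
            "utmHeading_deg": utm_heading_deg,
            "utmX_m": utm_x_m,
            "utmY_m": utm_y_m,
            "utmZ_m": utm_z_m,
        }
    }


def ego_speed_mps(previous, current):
    """Estimate ego speed in metres per second between two frame records."""
    a, b = previous["ego"], current["ego"]
    elapsed_s = (b["timestamp_epoch_ns"] - a["timestamp_epoch_ns"]) / 1e9
    if elapsed_s <= 0:
        raise ValueError("timestamps must be strictly increasing")
    distance_m = math.dist(
        (a["utmX_m"], a["utmY_m"], a["utmZ_m"]),
        (b["utmX_m"], b["utmY_m"], b["utmZ_m"]),
    )
    return distance_m / elapsed_s


frame_1 = ego_record(1_000_000_000, 0.0, 0.0, 0.0, 0.0)
frame_2 = ego_record(1_100_000_000, 3.0, 4.0, 0.0, 0.0)
speed = ego_speed_mps(frame_1, frame_2)  # 5 m travelled in 0.1 s
```

Each record would be written with json.dump to a file in the ego_data folder named after the matching point cloud file, for example xyz_timestamp.json.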
