Going Forward with the Perception Stack

Description

With the MRs related to issue #1314 (closed) merged, the current object tracking pipeline works as follows:

Euclidean clustering outputs cloud clusters with the autoware_auto_msgs::msg::PointClusters message type.
tracking_nodes subscribes to the cloud clusters.
In the MultiObjectTracker module, tracking_nodes converts the cloud clusters to the DetectedObjects message type. It calculates the bounding boxes with an algorithm based on the paper "Efficient L-shape fitting of laser scanner data for vehicle pose estimation".
The tracker works with DetectedObjects.
Tracked objects are labeled using the image detections. The bounding box shape is projected onto the image. The IOUHeuristic function returns the IoU between the detection ROIs and the tracked objects. Based on the IoU values, a greedy algorithm assigns ROIs to tracked objects.
tracking_nodes calculates a convex hull polygon prism for unlabeled tracked objects.
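
To make the labeling step above concrete, here is a minimal Python sketch of IoU-based greedy association between projected track ROIs and image detection ROIs. It is an illustration only, not the Autoware.Auto implementation: the function names `iou` and `greedy_associate`, the `(x_min, y_min, x_max, y_max)` box layout, and the `min_iou` threshold are all assumptions made for this example, and the real IOUHeuristic may order or gate candidates differently.

```python
from typing import Dict, List, Tuple

# Axis-aligned ROI in pixels: (x_min, y_min, x_max, y_max). Layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0


def greedy_associate(
    track_rois: List[Box], detection_rois: List[Box], min_iou: float = 0.3
) -> Dict[int, int]:
    """Greedily pair tracks with detections, highest IoU first.

    Returns a mapping track index -> detection index. Tracks without a
    match above `min_iou` (hypothetical threshold) stay unlabeled.
    """
    candidates = sorted(
        (
            (iou(t, d), ti, di)
            for ti, t in enumerate(track_rois)
            for di, d in enumerate(detection_rois)
        ),
        reverse=True,
    )
    assignment: Dict[int, int] = {}
    used_detections = set()
    for score, ti, di in candidates:
        if score < min_iou:
            break  # remaining candidates are all weaker
        if ti in assignment or di in used_detections:
            continue  # each track and each detection is used at most once
        assignment[ti] = di
        used_detections.add(di)
    return assignment
```

Greedy matching is cheap and easy to reason about, but it is not globally optimal; whether that matters depends on how crowded the image detections are.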

However, the bounding boxes coming from the shape fitting algorithm are not stable. Also, everything that is projected onto the image detections is assumed to be a box. I think we could apply the alternative solution proposed by Igor in issue #1314 (closed).

Proposed solution:

Estimate polygons with height (prisms) in euclidean_cluster_nodes
- Clustering should output prisms along with the cluster points, not only cubes. This represents the clusters better than oversized cubes.
- Edit the autoware_auto_msgs::msg::PointClusters message to contain cluster polygon information.
Port tier4's AutowareArchitectureProposal.iv shape_estimation module to Autoware.Auto. It is based on "Efficient L-Shape Fitting for Vehicle Detection Using Laser Scanners" and has been improved by tier4.
- From our observations, this algorithm gives better boxes for vehicles.
- It also outputs cylinders for pedestrians.
Project cluster prisms onto the image instead of the bounding box shape.
- Prisms have a finer resolution for the clusters, and they don't increase the algorithmic complexity by much.
Estimate the tracked object shape for labelled objects using shape_estimation. The final shape will be one of:
- Box (For vehicles)
- Polygon With Height (Prism) (Unlabeled objects)
- Cylinder (For pedestrians)
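
As a rough illustration of the first and last steps of the proposal, the Python sketch below builds a prism (convex hull footprint plus height range) from one cluster's points and picks the final shape type from a label. Everything here is hypothetical: the `Prism` type, the `cluster_to_prism` and `shape_for_label` helpers, and the label strings are invented for this example and do not reflect the PointClusters message layout or the shape_estimation API. The hull uses Andrew's monotone chain, chosen only because it is short; the actual node may use a different method.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


@dataclass
class Prism:
    """Polygon with height. Hypothetical type, not an Autoware message."""
    footprint: List[Point2]  # convex hull vertices, counter-clockwise
    z_min: float
    z_max: float


def _cross(o: Point2, a: Point2, b: Point2) -> float:
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point2]) -> List[Point2]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point2] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point2] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def cluster_to_prism(cluster: List[Point3]) -> Prism:
    """Footprint from the x-y hull, height range from the z extent."""
    if not cluster:
        raise ValueError("cluster must contain at least one point")
    hull = convex_hull([(x, y) for x, y, _ in cluster])
    zs = [z for _, _, z in cluster]
    return Prism(footprint=hull, z_min=min(zs), z_max=max(zs))


def shape_for_label(label: str) -> str:
    """Map a label to the proposed final shape; label strings are assumed."""
    return {"vehicle": "box", "pedestrian": "cylinder"}.get(label, "prism")
```

For example, `cluster_to_prism` applied to the four corners of a 2 m square plus one interior point yields a four-vertex footprint, and `shape_for_label("unknown")` falls back to `"prism"`, matching the unlabeled case in the list above.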

I'd appreciate your opinions on this topic. @niosus @xmfcx @mitsudome-r @frederik.beaujean @gowtham.ranganathan

Edited Nov 12, 2021 by Kaan Colak