Real-time visual understanding for machines that navigate the world. Object detection, drivable area segmentation, and monocular depth estimation trained on BDD100K driving data.
Real-time detection of vehicles, pedestrians, and traffic signs using a transformer-based architecture.
Pixel-level drivable area classification for safe navigation using an encoder-decoder architecture.
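As a rough illustration of what pixel-level classification means in practice, the sketch below turns per-class score maps into a drivable mask by picking the highest-scoring class at each pixel. The two-class label scheme, the `scores[c][y][x]` layout, and the function names are assumptions made for this example, not details of the project's model.

```python
from typing import List

# Hypothetical label scheme; the actual class indices are not given in the source.
BACKGROUND, DRIVABLE = 0, 1


def drivable_mask(scores: List[List[List[float]]]) -> List[List[int]]:
    """Return a per-pixel class map from scores[c][y][x] via argmax over c."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pick the class with the highest score at this pixel.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        mask.append(row)
    return mask


def drivable_fraction(mask: List[List[int]]) -> float:
    """Fraction of pixels labeled DRIVABLE."""
    total = sum(len(row) for row in mask)
    drivable = sum(row.count(DRIVABLE) for row in mask)
    return drivable / total


# Toy 2x2 example: scores[0] is background, scores[1] is drivable.
toy_scores = [
    [[0.9, 0.2], [0.1, 0.6]],
    [[0.1, 0.8], [0.9, 0.4]],
]
toy_mask = drivable_mask(toy_scores)
```

A real pipeline would run this argmax on the decoder's output tensor with an array library rather than nested lists; the plain-Python version only shows the logic.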
Monocular depth inference from a single camera frame for distance estimation.
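One common way to turn a depth map into a distance for a detected object is to take the median depth inside its bounding box, which is less sensitive to background pixels than the mean. The sketch below assumes the depth map already holds per-pixel distances in meters (many monocular models predict only relative depth and would need a scale calibration first). The `object_distance` helper and the `(x_min, y_min, x_max, y_max)` box format are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def object_distance(depth_map: List[List[float]],
                    box: Tuple[int, int, int, int]) -> float:
    """Median depth inside box = (x_min, y_min, x_max, y_max), max exclusive.

    Assumes depth_map[y][x] is a metric distance; this is an assumption of
    the sketch, not a property stated by the source.
    """
    x_min, y_min, x_max, y_max = box
    values = [
        depth_map[y][x]
        for y in range(y_min, y_max)
        for x in range(x_min, x_max)
    ]
    if not values:
        raise ValueError("bounding box covers no pixels")
    return median(values)


# Toy 3x3 depth map (meters) with a nearby object in the top-left corner.
toy_depth = [
    [10.0, 10.0, 30.0],
    [10.0, 12.0, 30.0],
    [50.0, 50.0, 50.0],
]
nearest = object_distance(toy_depth, (0, 0, 2, 2))
```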
Berkeley DeepDrive dataset (BDD100K) with 100K driving videos and rich annotations for detection, segmentation, and drivable area classification across diverse weather and lighting conditions.