Mayank Bansal, PhD, is a Principal Applied Scientist at AWS Just-Walk-Out (JWO)/Amazon Go Research, where he is incubating a framework and product vision (with Dr. Greg Hager) for video-centric problem solving that combines Large Language Models (LLMs) with Large Vision and Vision-Language Foundation Models (VLMs) and uses multimodal AGI in a continual learning loop. He has also solved problems in generalized, composable multi-camera person identity tracking, such as disjoint multi-camera Deep-ReID under edge-compute constraints and single- and multi-camera localization. With Dr. Gerard Medioni, he established the framework and led the development of streaming end-to-end learning [patent pending] for the next-generation “Just Walk Out” technology stack.
Prior to Amazon, he was a Staff Research Scientist at Waymo, formerly the Google Self-Driving Car Project. In this role, he led efforts on end-to-end deep-learning techniques for planning and prediction. Before that, he focused on the perception stack, solving problems such as detection, tracking, and localization of emergency vehicles, and detection of turn signals, hazard lights, and various other signals on other vehicles in the scene. Prior to Waymo, he was a Principal Research Scientist at the Center for Vision Technologies, SRI International, where he led several computer vision and robotics research and development (R&D) programs for a range of government (DARPA, NGA, FHWA, NIH, etc.) and commercial (Google Inc., Autoliv Inc.) clients. In parallel with this full-time role, he pursued a full-time PhD in Computer Vision at the University of Pennsylvania from 2010 to 2014 and developed novel techniques for matching disparate views of a given 3D scene.
Bansal has more than 20 years of experience in 3D/2D object detection and recognition from LiDAR and EO/IR mono/stereo camera data, as well as in deep-learning modeling and research for a variety of computer vision and robotics applications. He has significant expertise in geometric and stereo vision, perception for mobile robotics, medical image analysis, and geo-localization techniques from a variety of input sources.
Bansal is a Senior Member of the IEEE and the IEEE Computer Society. He has published close to 40 papers in peer-reviewed conferences and journals and holds twelve patents.
Bansal received his PhD in Computer & Information Sciences from the University of Pennsylvania. His Master's and Bachelor's degrees in Computer Science & Engineering are from the Indian Institute of Technology (IIT) Delhi, New Delhi, India, where he was awarded the Institute Silver Medal for ranking first in the Department of Computer Science.
Learning to Drive: Beyond Pure Imitation
By Mayank Bansal and Abhijit Ogale – Waymo Research. At Waymo, we are focused on building the world’s most experienced driver. And just like any good driver, our vehicle needs to perceive and understand the world around it by recognizing surrounding objects and predicting what they might do next, before deciding how to drive safely while obeying the traffic rules.
A Novel Zero-shot Approach for Detecting User-specified Objects in Images Leveraging Vision Foundation Models and Large Language Models
With Dr. Greg Hager
US Patent Application Filed
Detecting Events by Streaming Pooled Location Features from Cameras
With Dr. Gerard Medioni
US Patent Application Filed
Object Localization for Autonomous Driving by Visual Tracking and Image Reprojection
Issued Dec 26, 2023 US 11854229 B2
Issued Aug 11, 2022 [US-20220253066-A1]
Issued May 31, 2022 [US-11347231-B2]
Issued Feb 11, 2021 [US-20210041883-A1]
Behavior Prediction of Surrounding Agents
With Dr. Dragomir Anguelov
Issued Oct 26, 2023 US-20230343107-A1
Issued Aug 15, 2023 [US-11727690-B2]
Issued Oct 07, 2021 [US-20210312177-A1]
Detection of Emergency Vehicles
Issued Aug 15, 2023 US-11727692-B2
Issued Apr 28, 2022 [US-20220130133-A1]
Issued Jan 04, 2022 [US-11216689-B2]
Issued Feb 04, 2021 [US-20210034914-A1]
Agent Trajectory Prediction Using Anchor Trajectories
Issued Jul 27, 2023 US-20230234616-A1
Issued Apr 04, 2023 [US-11618481-B2]
Issued Jan 07, 2021 [US-20210001897-A1]
Neural Networks with Attentional Bottlenecks for Trajectory Planning
Issued Jan 31, 2023 US-11565715-B2
Issued Mar 18, 2021 [US-20210078594-A1]
Real-time Human-machine Collaboration using Big Data Driven Augmented Reality Technologies
Issued Jul 26, 2022 US-11397462-B2
Issued Dec 29, 2016 [US-20160378861-A1]
Neural Networks for Vehicle Trajectory Planning
Issued Feb 22, 2022 US-11256983-B2
Issued Jan 05, 2021 [US-10883844-B2]
Issued Jun 04, 2020 [US-20200174490-A1]
Issued Jan 31, 2019 [US-20190034794-A1]
Issued Jan 31, 2019 [US-20190033085-A1]
Multi-dimensional Realization of Visual Content of an Image Collection
Issued Jul 27, 2021 US-11074477-B2
Issued Jun 23, 2020 [US 10691743 B2]
Issued Dec 14, 2017 [US-20170357878-A1]
Issued Aug 22, 2017 [US 9740963-B2]
Issued Feb 11, 2016 [US-20160042252-A1]
Issued Feb 11, 2016 [US-20160042253-A1]
Method and Apparatus for Correlating and Viewing Disparate Data
Issued Sep 4, 2018 US 10068024 B2
Issued Feb 7, 2017 [US 9563623]
Issued Apr 21, 2016 [US-20160110433-A1]
Issued Jun 9, 2015 [US 9053194 B2]
Method and Apparatus for Real-time Pedestrian Detection for Urban Driving
Issued Oct 14, 2014 US 8861842 B2
System and Method of Detecting Objects
Issued Feb 26, 2013 US 8385599 B2
Apparatus and Method for Object Detection and Tracking and Roadway Awareness using Stereo Cameras
Issued Jan 31, 2012 US 8108119