Ideal Auto believes that a VLA (vision-language-action) model can achieve the goal of combining 3D and 2D vision.

In this view, a VLA model can perceive the physical world fully by combining 3D and 2D vision, whereas a VLM (vision-language model) can only analyze 2D images. A VLA model also has a complete "brain": its language and chain-of-thought (CoT) reasoning abilities let it see, understand, and then actually carry out actions, which mirrors the way humans operate.