SenseTime releases "Shusheng 2.5", a multi-modal, multi-task general-purpose large model

SenseTime recently released "Shusheng 2.5", a multi-modal, multi-task general-purpose large model with 3 billion parameters. Among the world's open-source models, it is the largest and the most accurate on ImageNet, and it scored more than 65.0 mAP on the COCO object-detection benchmark, providing efficient and accurate perception and understanding for general-scene tasks such as autonomous driving and robotics. "Shusheng 2.5" has now been released on the OpenGVLab open-source platform.
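
For reference, COCO detection scores such as the 65.0 mAP cited above are conventionally computed with the official pycocotools evaluator. The following is a minimal sketch, not SenseTime's own evaluation code; the annotation and prediction file paths are placeholders, and the predictions are assumed to be exported in the standard COCO results JSON format:

    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    # Ground-truth annotations and model predictions (paths are placeholders)
    coco_gt = COCO("annotations/instances_val2017.json")
    coco_dt = coco_gt.loadRes("predictions.json")

    # Standard bounding-box evaluation protocol
    evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()

    # stats[0] is AP averaged over IoU thresholds 0.50:0.95, i.e. the headline "mAP"
    print(f"mAP: {evaluator.stats[0]:.3f}")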