Responsibilities
1. Track progress in cutting-edge multi-modal large model algorithms, with a focus on applications in content security for international short video.
2. Address practical business needs in content understanding and content security by improving algorithms.
3. Use multi-modal model recognition capabilities, combined with recommendation system technology, to reduce security risks in recommended content.
4. Specific areas include multi-modal content understanding, multi-modal content identification, multi-modal pre-training, and content distribution strategy optimization.
Qualifications
1. Degree in computer science, mathematics, statistics, or a related major; master's degree or above preferred.
2. In-depth research experience in an area of multimedia or computer vision, including but not limited to image search, image/video classification and recognition, image segmentation, object detection, image-text multi-modal models, and video-text multi-modal models.
3. Preference given to candidates familiar with multi-modal large models, including but not limited to LLaVA, Mini-Gemini, and Qwen-VL.
4. Preference given to candidates with strong practical ability, such as winners of competitions like Kaggle, COCO, ImageNet, or ActivityNet, and to those who have published papers at top academic conferences (such as CVPR, ICCV, or ECCV).