Responsibilities
The ByteDPU team builds the computing infrastructure foundation for ByteDance Group and the Volcano Engine public cloud. The team researches and develops the underlying software and hardware technologies (compute, network, storage) for the next generation of cloud computing. This covers both product development and cutting-edge exploration of a next-generation hypervisor foundation with integrated software and hardware virtualization, a self-developed user-space network protocol stack, high-speed transport protocols and applications, virtual network switches, and high-performance storage stacks.
1. Explore CCL (collective communication library) performance optimization for large-model inference and training, and its application to performance analysis and GPU fault detection.
2. Explore combining DPU NICs with CCL to accelerate collective communication.
3. Explore DPU-based GPU virtualization and pooling/sharing technologies to improve GPU utilization.
Qualifications
1. Bachelor's degree or above in a computer-related major.
2. Experience with CCL and an understanding of common machine learning frameworks and parallelism strategies.
3. Proficiency in C/C++ and Python.
4. Understanding of the basic working principles of GPUs (NVIDIA, AMD, or Intel) and familiarity with GPU architecture.
5. Experience with high-speed transport technologies such as RDMA and NVLink is preferred.
6. Experience with CUDA programming is preferred.