Hardware Accelerated Inference Engine Runtime Development Engineer-Data (Shenzhen)

3-5 Years

This job is no longer accepting applications

  • Posted 20 days ago

Job Description

Responsibilities

1. Design and implement the core runtime components of the inference engine, including model loading, graph optimization, operator scheduling, and memory management.
2. Develop and maintain the inference engine's runtime library, supporting multiple deep learning frameworks (such as TensorFlow, PyTorch, ONNX, etc.); analyze and resolve runtime performance bottlenecks to improve throughput and reduce latency.
3. Optimize the performance of the inference engine for different hardware platforms (CPU, GPU, NPU, etc.), including operator optimization, memory optimization, and computation graph optimization.
4. Develop and maintain cross-platform support for the inference engine so that it runs stably on multiple operating systems (Linux, Windows, embedded systems, etc.) and hardware architectures.
5. Develop and maintain the inference engine's compilation toolchain to support model conversion and optimization techniques such as quantization and pruning; provide debugging and profiling tools that help developers analyze and optimize inference performance.
6. Work with algorithm and product teams to support the rapid integration and deployment of new models and new operators, so that the inference engine meets actual business needs.

Qualifications

1. Bachelor's degree or above in computer science, electrical engineering, mathematics, or a related field; more than 3 years of experience developing inference engines, compilers, or high-performance computing software.
2. Proficient in C++ and Python, with solid programming skills and code optimization experience.
3. Familiar with the runtime mechanisms of deep learning frameworks (such as TensorFlow, PyTorch, ONNX, etc.).
4. Familiar with computer architecture and with hardware architectures such as CPU, GPU, and NPU.
5. Familiar with multithreaded programming, memory management, and performance optimization techniques.
6. Good communication skills and team spirit, with strong problem-analysis and problem-solving abilities.

Bonus points:

1. Familiarity with model compression techniques such as quantization, pruning, and distillation.
2. Familiarity with compiler technologies such as LLVM and MLIR.
3. Experience contributing to open source projects.
4. Experience developing for edge computing devices or embedded systems.
5. Experience developing or optimizing inference engines (such as TensorRT, OpenVINO, TVM, etc.).

About Company

ByteDance is a technology company operating a range of content platforms that inform, educate, entertain and inspire people across languages, cultures, and geographies.
Dedicated to building global platforms of creation and interaction, ByteDance now has a portfolio of applications available in over 150 markets and 75 languages, including TikTok, Helo, Vigo Video, Douyin, and Huoshan.

Job ID: 106292433