Overview
Machine Learning System Engineer - Data AML - Soaring Star Talent Program at ByteDance.
Responsibilities
- Team Introduction: Data AML is ByteDance's machine learning middle platform, providing training and inference systems for recommendation, advertising, CV, speech, and NLP across Douyin, Toutiao, and Xigua Video.
AML provides machine learning computing capabilities to internal business units and conducts research on general and innovative algorithms to solve key business challenges. Through Volcano Engine, it delivers core machine learning and recommendation system capabilities to external enterprise clients.
Beyond business applications, AML is also engaged in cutting-edge research in AI for Science and scientific computing.
- Research Project Introduction: Large-scale recommendation systems are applied to short-video, text-community, image, and other products, where modality information serves as a generalization feature supporting business scenarios such as recommendation.
The work explores multimodal co-training, large models at the 7B/13B parameter scale, and end-to-end modeling of longer sequences, all based on algorithm-engineering co-design.
- Engineering and Algorithmic Research Directions: Develop multimodal sample representations; build high-performance multimodal inference engines based on PyTorch; develop high-performance multimodal training frameworks; apply heterogeneous hardware in multimodal recommendation systems.
Design architectures for recommendation-advertising and multimodal co-training; work with Sparse Mixture of Experts (Sparse MoE), memory networks, and mixed-precision techniques.
Qualifications
- Doctoral degree preferred, with priority for candidates in Computer Science, Software Engineering, or related fields.
- Proficiency in one or more programming languages (C/C++/Go/Python/Java) in a Linux environment.
- Deep understanding of distributed systems; experience designing, developing, and maintaining large-scale distributed systems.
- Strong analytical and logical skills, ability to abstract and decompose complex business logic, and collaborative team spirit.
- Strong sense of responsibility, learning ability, communication skills, and self-motivation.
- Good technical documentation habits, including timely writing and updating of work processes and technical docs.
Bonus Qualifications
- Familiarity with Kubernetes architecture and cloud-native system development.
- Experience with TensorFlow, PyTorch, MXNet, or other mainstream ML frameworks.
- Familiarity with Django, Flask, or related technologies, with backend development experience.
- Experience in one or more areas: AI Infrastructure, HW/SW Co-Design, High-Performance Computing, ML Hardware Architecture (GPU, accelerators, networking), ML for Systems, Distributed Storage, or large-scale cloud/private cloud platforms.
About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life.
With products including TikTok, Lemon8, CapCut, Pico, Toutiao, Douyin, and Xigua, ByteDance connects people to create and consume content.
Why Join ByteDance
Inspiring creativity is at the core of ByteDance's mission.
Our innovative products help people express themselves, discover, and connect.
We foster a global, diverse team and an Always Day 1 mindset to achieve meaningful breakthroughs for ourselves, our company, and our users.
Diversity & Inclusion: ByteDance is committed to an inclusive workplace where employees are valued for their skills and perspectives.
We celebrate diverse voices and strive to reflect the communities we reach.