Job description
About the team
The Data-Ecommerce-Platform Governance-Smart Audit Algorithm Team focuses on the development of large models in NLP, CV, and multimodal domains.
The team aims to establish state-of-the-art (SOTA) models while delving deeply into these areas to optimize algorithms for e-commerce data, thereby enhancing business outcomes.
By refining algorithms and collaborating with business operations, the team strives to govern the quality and ecosystem of ByteDance's e-commerce products comprehensively.
This includes addressing issues such as risks, violations, and low-quality content, while also fostering a healthy e-commerce ecosystem.
The ultimate goal is to maximize platform governance efficiency and effectiveness.
Job Responsibilities
- Large Language Model Algorithm Development: Build domain-specific large language models (LLM/MLLM) for e-commerce, integrating domain knowledge to rapidly apply models to business scenarios.
- E-commerce Governance Optimization: Understand e-commerce governance scenarios deeply to improve merchant/product/video/live-stream/IPR governance through algorithm optimization.
Develop state-of-the-art intelligent review systems that can explain their rejection decisions (“knowing why to reject”).
- Model Enhancement: Handle tasks like data construction, foundational model enhancement, instruction fine-tuning, chain-of-thought (CoT), and parameter-efficient fine-tuning (PEFT) to achieve optimal model performance in the e-commerce domain.
- Problem Solving for Governance Applications: Address challenges such as long text/sequence modeling, few-shot learning, content moderation, violation detection, and policy recommendation using large models and multimodal approaches.
- Model Development and Optimization: Research and optimize e-commerce-specific NLP and multimodal large models to improve multilingual, multi-task, and multimodal algorithm performance across various e-commerce scenarios.
Minimum qualifications
- Strong Technical Background: Solid foundation in machine learning and familiarity with cutting-edge AI technologies.
Preference for candidates with high-quality academic publications or competition experience.
- Big Data Proficiency: Familiarity with big data frameworks and applications like MapReduce/Spark is preferred.
- Model Training Expertise: Experience with training and deploying TensorFlow/PyTorch models.
Knowledge of training acceleration methods such as mixed precision training and distributed training is a plus.
- Model Compression and Inference Optimization: Understanding of research and techniques for model compression and inference acceleration, including quantization, pruning, distillation, and TensorRT optimization.
Preferred Qualifications
Expertise in One of the Following Areas:
- Computer Vision (CV) & Multimodal:
In-depth knowledge in fields such as image search, classification, segmentation, detection, OCR, graph neural networks, multimodal learning, unsupervised/self-supervised learning, etc.
Experience in CV/multimodal large model projects is preferred, especially for e-commerce scenarios like video/product multimodal modeling.
Strong practical abilities, with achievements in competitions such as Kaggle, COCO, ImageNet, ActivityNet, ICPC, etc.
Publications in top-tier conferences (e.g., CVPR, ICCV, ECCV) are a plus.
- Natural Language Processing (NLP):
Expertise in areas such as pretraining, NLU, multilingual and cross-lingual learning, NLG, transfer learning, and semi-supervised learning.
Experience in LLM-related projects and applying them to unify e-commerce NLP tasks is a plus.
Strong practical abilities, with achievements in competitions like Kaggle, GLUE, SuperGLUE, CLUE, etc.
Publications in top-tier conferences (e.g., ACL, EMNLP) are a plus.