Machine Learning Engineer, Generative AI Innovation Center
Description
The Generative AI Innovation Center at AWS empowers customers to harness state-of-the-art AI technologies for transformative business opportunities.
Our multidisciplinary team of strategists, scientists, engineers, and architects collaborates with customers across industries to fine-tune and deploy customized generative AI applications at scale.
Additionally, we work closely with foundational model providers to optimize AI models for Amazon Silicon, enhancing performance and efficiency.
As an SDE on our team, you will drive the development of custom Large Language Models (LLMs) across languages, domains, and modalities.
You will be responsible for fine-tuning state-of-the-art LLMs for diverse use cases while optimizing models for high-performance deployment on AWS’s custom AI accelerators.
This role offers an opportunity to innovate at the forefront of AI, tackling end-to-end LLM training pipelines at massive scale and delivering next-generation AI solutions for top AWS clients.
Key job responsibilities:
Large-Scale Training Pipelines: Design and implement distributed training pipelines for LLMs using tools such as Fully Sharded Data Parallel (FSDP) and DeepSpeed, ensuring scalability and efficiency
LLM Customization & Fine-Tuning: Adapt LLMs for new languages, domains, and vision applications through continued pre-training, fine-tuning, and Reinforcement Learning from Human Feedback (RLHF)
Model Optimization on AWS Silicon: Optimize AI models for deployment on AWS Inferentia and Trainium, leveraging the AWS Neuron SDK and developing custom kernels for enhanced performance
Customer Collaboration: Interact with enterprise customers and foundational model providers to understand their business and technical challenges, co-developing tailored generative AI solutions
About The Team
AWS values diverse experiences.
Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply.
If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious.
Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer.
That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
We value work-life harmony.
Success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture.
When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Basic Qualifications
3+ years of non-internship professional software development experience
2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Experience programming with at least one software programming language
Hands-on experience with deep learning and/or machine learning methods (e.g., for training, fine-tuning, and inference)
Hands-on experience with generative AI technology
Preferred Qualifications
3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Bachelor's degree in computer science or equivalent
1+ years of hands-on experience developing, deploying, or optimizing machine learning models using a recognized ML library or framework
Our inclusive culture empowers Amazonians to deliver the best results for our customers.
If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information.