Location: Australia & New Zealand (candidates must have valid working rights in either country)
Position Overview
We are seeking a highly skilled Data Scientist with strong expertise in Databricks, Azure, and AWS, specializing in Agentic Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs). The role focuses on designing and productionizing intelligent AI/ML systems with scalable, cloud-native deployments, CI/CD pipelines, and MLOps best practices. The ideal candidate is hands-on, solution-oriented, and experienced in building and deploying advanced AI systems across multiple cloud platforms.
Key Responsibilities
Design and implement Agentic RAG pipelines using Databricks Vector Search, MLflow, and Unity Catalog, integrated with Azure Cognitive Search and AWS OpenSearch.
Develop agent-based workflows using LangChain, LangGraph, LlamaIndex, and other tool-augmented reasoning frameworks.
Fine-tune, evaluate, and deploy LLMs (OpenAI, Anthropic, MosaicML, Hugging Face, Llama) for enterprise applications.
Build CI/CD pipelines for ML & GenAI workloads (an illustrative model-promotion sketch follows this list), including:
Automated build/test/deploy workflows (Azure DevOps, GitHub Actions, Jenkins, AWS CodePipeline).
MLflow model registry integration with production/staging environments.
Infrastructure-as-Code (IaC) using Terraform, Bicep, or CloudFormation for reproducible deployments.
Implement MLOps best practices: experiment tracking, versioning, continuous evaluation, and automated retraining pipelines.
Ensure data governance, compliance, and security for sensitive datasets across Azure and AWS.
Collaborate with engineering and product teams to integrate ETL/ELT pipelines in Azure Data Factory, Synapse, AWS S3, Redshift, and Glue.
Deploy and monitor models with online evaluation pipelines (MLflow Evaluate, DeepEval, and custom scorers such as faithfulness and retrieval recall).
Provide technical mentorship on GenAI architecture, CI/CD, and production-grade LLM deployments.
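For illustration only, the sketch below shows the kind of model-promotion step this role would automate in a CI/CD job: using the MLflow client to move a registered model version into Production. The model name, version, and tracking URI are hypothetical, and newer MLflow releases favour model aliases over stage transitions.

```python
# Illustrative sketch: promote a registered MLflow model version to Production
# as part of a CI/CD pipeline. Names, version, and tracking URI are hypothetical.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("databricks")  # assumes a configured Databricks workspace

client = MlflowClient()
MODEL_NAME = "rag_answer_generator"  # hypothetical registered model name
VERSION = "3"                        # hypothetical version to promote

# Move the version to Production and archive whatever is currently serving.
client.transition_model_version_stage(
    name=MODEL_NAME,
    version=VERSION,
    stage="Production",
    archive_existing_versions=True,
)
print(f"Promoted {MODEL_NAME} v{VERSION} to Production")
```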
Required Skills & Qualifications
Bachelor’s or Master’s degree in Data Science, Computer Science, AI/ML, or a related field (a PhD is a plus but not required).
4+ years of professional experience delivering ML/AI or data science solutions, including cloud-native deployments.
Strong expertise with the Databricks ecosystem: Spark (PySpark/Scala), Delta Lake, Unity Catalog, MLflow, Vector Search.
Hands-on experience with CI/CD pipelines for ML and GenAI:
Azure DevOps, GitHub Actions, or Jenkins.
Automated testing for ML pipelines.
Model promotion workflows (dev → staging → prod).
Proficiency in Python, SQL, distributed data processing, and cloud-native ML frameworks.
Deep experience with Azure (ML, Data Factory, Synapse, Data Lake) and AWS (SageMaker, Glue, S3, Redshift).
Strong knowledge of LLM orchestration frameworks (LangChain, LangGraph, LlamaIndex).
Solid understanding of LLM & RAG evaluation metrics (faithfulness, token-F1); an illustrative token-F1 scorer follows this list.
Must have valid working rights in Australia or New Zealand.
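As a purely illustrative sketch of the evaluation metrics mentioned above, the function below computes a SQuAD-style token-level F1 between a model answer and a reference answer; whitespace tokenization and the example strings are simplifying assumptions.

```python
# Illustrative sketch: token-level F1 between a prediction and a reference answer.
# Whitespace tokenization is a simplifying assumption.
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match; otherwise no credit.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical usage:
print(token_f1("the model was deployed on Databricks", "deployed on Databricks"))
```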
Preferred Qualifications
Experience deploying multi-agent LLM systems in production.
Familiarity with Infrastructure-as-Code (Terraform, Bicep, CloudFormation) for CI/CD automation.
Hands-on experience with containerization and orchestration (Docker, Kubernetes, AKS, EKS).
Contributions to open-source GenAI/LLM projects or published research.