Level
Entry to Mid Level
(PhD Required)
Bridge Cutting-Edge AI Research with Petabyte-Scale Data Systems
Pixalate is an online trust and safety platform that protects businesses, consumers, and children from deceptive, fraudulent, and non-compliant mobile apps, CTV apps, and websites.
We are seeking a PhD-level Big Data Engineer to revolutionize how AI transforms massive-scale data operations.
Our impact is real and measurable.
Our software has uncovered:
Gizmodo: An iCloud Feature Is Enabling a $65 Million Scam
Washington Post: Your kids' apps are spying on them
ProPublica: Porn, Piracy, Fraud: What Lurks Inside Google's Black Box Ad Empire
About The Role
Work at the intersection of big data and AI, where you'll develop intelligent, self-healing data systems processing trillions of data points daily.
You'll have autonomy to pursue research in distributed ML systems and AI-enhanced data optimization, with your innovations deployed at unprecedented scale within months, not years.
This isn't traditional data engineering - you'll implement agentic AI for autonomous pipeline management, leverage LLMs for data quality assurance, and create ML-optimized architectures that redefine what's possible at petabyte scale.
Key Research Areas & Responsibilities
AI-Enhanced Data Infrastructure
Design intelligent pipelines with autonomous optimization and self-healing capabilities using agentic AI
Implement ML-driven anomaly detection for terabyte-scale datasets
Distributed Machine Learning at Scale
Build distributed ML pipelines
Develop real-time feature stores for billions of transactions
Optimize feature engineering with AutoML and neural architecture search
Required Qualifications
Education & Research
PhD in Computer Science, Data Science, or Distributed Systems (exceptional Master's with research experience considered)
Published research or expertise in distributed computing, ML infrastructure, or stream processing
Technical Expertise
Core Languages: Expert SQL (window functions, CTEs), Python (Pandas, Polars, PyArrow), Scala/Java
AutoML and Neural Architecture Search: KerasTuner, AutoKeras, Ray Tune, Optuna, PyTorch Lightning + Hydra
Research Skills
Track record with 100TB+ datasets
Experience with lakehouse architectures, streaming ML, and graph processing at scale
Understanding of distributed systems theory and ML algorithm implementation
Preferred Qualifications
Experience applying LLMs to data engineering challenges
Ability to translate complex AutoML/NAS research into practical production workflows
Hands-on project examples of feature engineering automation or NAS experiments
Proven success in automating ML pipelines, from raw data to an optimized model architecture
Contributions to Apache projects (Spark, Flink, Kafka)
Knowledge of privacy-preserving techniques and data mesh architectures
What Makes This Role Unique
You'll work with one of the few truly petabyte-scale production datasets outside of major tech companies, with the freedom to experiment with cutting-edge approaches.
Unlike traditional big data roles, you'll apply the latest AI research to fundamental data challenges - from using LLMs to understand data quality issues to implementing agentic systems that autonomously optimize and heal data pipelines.
At Pixalate, you will have the opportunity to work on pioneering technologies alongside some of the brightest minds in the industry.
If you're passionate about maintaining high software quality and thrive in a fast-paced, challenging environment, you'll fit right in.
Benefits
Monthly internet reimbursement
Casual, remote work environment
Hybrid, flexible hours
Opportunity for advancement
Being part of a high-performing team that wants to win and have fun doing it
Seniority level: Entry level
Employment type: Full-time
Job function: Engineering, Information Technology, and Research
Industries: Technology, Information and Internet, Software Development, and Advertising Services