Key Responsibilities
Platform & Backend Development: Design backend services (Python, FastAPI, gRPC) to support agent workflows, computer vision pipelines, and evaluation loops.
Build scalable APIs for orchestration, task management, vector search, and model serving.
Infrastructure & Deployment: Own CI/CD pipelines (GitHub Actions, Terraform) and production deployments.
Develop infrastructure for memory stores, compute orchestration, and model packaging (Docker, TorchServe, BentoML).
Engineering Excellence: Establish quality practices including testing (Pytest), monitoring, and observability (Prometheus/Grafana).
Ensure fault-tolerant, modular, and scalable system design.
Collaboration & Leadership: Mentor peers through code reviews, documentation, and clean architecture.
Lead system design discussions and integration with AI and platform teams.
Must-Have Skills
6+ years of software engineering experience, including 2+ years in AI/ML environments.
Proficient in Python and production-grade API development (FastAPI, Flask, gRPC).
Experience with CI/CD and infrastructure-as-code (GitHub Actions, Terraform).
Skilled in containerization (Docker, Kubernetes) and cloud platforms (AWS, GCP, or Azure).
Familiarity with databases: SQL, NoSQL, and vector DBs (FAISS, Weaviate, pgvector).
Understanding of ML lifecycles: data ingestion, inference, monitoring, and recovery.
Proven ability to design distributed systems (API gateways, data pipelines, compute orchestration).
Bonus Skills
Familiarity with AI agent frameworks (LangChain, AutoGen, CrewAI).
Understanding of computer vision concepts and deployment challenges.
Exposure to LLM APIs or GenAI integrations.
Experience with ML observability and error logging systems.
Knowledge of front-end prototyping tools (Gradio, Streamlit, etc.).
What We Offer
Small, agile team (5–6 engineers + interns) with autonomy and real ownership.
Startup feel with big-company resources: an international environment where most of the team and leadership come from startups or large international corporations (Lazada, Gojek, IBM) and from a variety of countries.
Low-bureaucracy, high-impact startup environment where your code directly supports next-gen AI deployment.
A culture of experimentation and self-development.
Knowledge sharing and collaboration.
Direct collaboration with top AI researchers and computer vision scientists.
Hybrid work setup: ~2–3 days in office per week.