We are hiring a Cloud Engineer (SRE – Level 2) to join a Singapore Government-appointed agency’s Cloud Infrastructure Operations team.
This is a permanent role within a highly secure and regulated environment, focusing on commercial cloud infrastructure (AWS).
You will be part of a team responsible for maintaining, optimizing, and automating mission-critical systems that ensure reliability, security, and compliance for large-scale government applications.
The ideal candidate is hands-on with AWS services, skilled in Infrastructure-as-Code (IaC), and passionate about automation, observability, and operational excellence.
This position offers the opportunity to work on complex, multi-service cloud platforms while contributing to infrastructure innovation and resilience.
Key Responsibilities
- Operate and maintain AWS-native services across production environments (Lambda, ECS, EKS, Redshift, GuardDuty, Security Hub, KMS, WAF, and more).
- Design, implement, and maintain infrastructure automation using Terraform, CloudFormation, and Ansible.
- Manage patching and lifecycle upgrades for Linux (RHEL) and Windows Server environments using AWS Systems Manager Patch Manager, WSUS, and YUM/DNF.
- Monitor infrastructure health, manage alerts, and troubleshoot production incidents to ensure uptime and system stability.
- Track and renew SSL/TLS certificates, remediate end-of-life (EOL) components, and ensure compliance with internal standards.
- Document infrastructure changes, runbooks, post-mortem reports, and audit artifacts for operational transparency.
- Collaborate with cross-functional teams to enhance reliability, scalability, and observability within the cloud environment.
- Provide mentorship and guidance to junior engineers, driving best practices in automation and cloud operations.
What We’re Looking For
- 6+ years of experience in DevOps / Site Reliability Engineering (SRE) roles, with at least 4 years in public or regulated cloud environments.
- Proven hands-on experience with AWS cloud infrastructure in production.
- Strong proficiency in Infrastructure-as-Code using Terraform, CloudFormation, and Ansible.
- Solid understanding of Linux (RHEL) and Windows Server administration.
- Experience managing patching, lifecycle upgrades, and SSL/TLS certificate renewals in secure environments.
- Familiarity with monitoring and observability tools and with incident management processes.
- Excellent problem-solving, communication, and documentation skills.
Nice to Have
- AWS Certified Solutions Architect or equivalent certification.
- RHCE or Microsoft Server Administration certification.
- Experience working in government or compliance-heavy environments.
- Exposure to containerized and microservices architectures (EKS, Docker, Kubernetes).
Industries
- IT Services and IT Consulting