Site Reliability Engineer at EXASOFT PTE. LTD.

Job Overview

Company

EXASOFT PTE. LTD.

Location

Singapore

Job Description

Job Summary:

We are seeking a Senior Site Reliability Engineer (SRE) with 10–15 years of proven experience in building, managing, and maintaining highly available, scalable, and secure infrastructure across multi-cloud and hybrid cloud environments, including on-premises data centers.

The ideal candidate will have deep knowledge of SRE principles, strong hands-on experience in automation, observability, incident response, and infrastructure resilience, and the ability to architect solutions that span cloud and traditional data center environments.

Key Responsibilities:

Design, implement, and manage reliable and scalable systems across public clouds (AWS, Azure, GCP) and on-premises data centers.
Apply SRE best practices (SLIs, SLOs, error budgets, incident management, and postmortems) across cloud and non-cloud environments.
Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, or CloudFormation.
Drive automation for deployment, scaling, monitoring, and infrastructure management.
Implement and enhance observability practices (monitoring, logging, tracing) using tools such as Prometheus, Grafana, ELK, Datadog, and New Relic.
Work with application teams to ensure high availability, performance, and cost optimization across hybrid environments.
Lead and participate in on-call rotations and improve overall incident response processes.
Collaborate with security and compliance teams to enforce best practices in data protection, access control, and system hardening in hybrid setups.
Evaluate and recommend emerging tools and technologies for resilience engineering, disaster recovery, and infrastructure modernization.

Required Qualifications:

10–15 years of experience in SRE, DevOps, or infrastructure engineering roles.
Proven experience managing infrastructure in multi-cloud (AWS, Azure, GCP) and hybrid cloud/on-prem environments.
Solid understanding of networking, load balancing, storage, virtualization, and container orchestration (Kubernetes, Docker).
Strong scripting and programming skills (e.g., Python, Go, Bash).
Experience with CI/CD pipelines and tools such as Jenkins, GitLab CI, and ArgoCD.
In-depth knowledge of SRE methodologies and real-world application of SLAs, SLOs, and error budgets.
Hands-on experience with monitoring and observability stacks.
Strong analytical and troubleshooting skills for production incidents across complex, distributed systems.

Job Details:
https://sg.expertini.com/jobs/job/site-reliability-engineer-singapore-exasoft-pte-ltd-4194-3114003/
