AVP, SRE Observability Platform Engineer, SRE & Governance, Group Technology 
Overview 
Group Technology enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation.
In Group Tech, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Job Objective 
DBS Bank is looking for a Platform SRE Observability Engineer with experience working on enterprise-level data engineering, analytics, and observability applications.
The SRE engineer will be responsible for ensuring high availability of the platform services and for making continuous improvements to increase the platform’s efficiency and resiliency.
The SRE engineer will also perform automation development tasks to remove toil and increase the team’s productivity.
Roles and Responsibilities 
Develop monitoring and onboarding guidelines for various applications using the observability platform stack, ensuring accurate monitoring and data collection 
Implement Observability standards, best practices, operations and processes for the Enterprise in AppDynamics & other observability tools 
Automate routine tasks and reporting processes using APIs and scripting, reducing manual effort and improving efficiency in AppDynamics & other observability tools 
Identify and resolve performance issues through detailed analysis of transaction traces, application logs, and system metrics 
Collaborate with stakeholders to define performance metrics and monitoring requirements aligned with business goals 
Contribute to internal knowledge bases, create documentation, and share insights with the team to promote a culture of learning and collaboration 
Design and implement monitoring solutions to track application performance, identify bottlenecks, support capacity planning and optimise system efficiency 
Develop custom dashboards and reports to provide actionable insights and drive decision-making processes 
Collaborate with development and operations teams to integrate the observability platform stack with CI/CD pipelines and other DevOps tools 
Configure and fine-tune alerts to proactively detect and address performance issues before they impact end-users 
Continuously review and enhance monitoring processes and methodologies to improve efficiency and effectiveness 
Work with application teams to develop long-term monitoring strategies that align with business goals and technology roadmaps 
Create data retention policies and access controls (RBAC) to manage user permissions 
Perform application maintenance, patching, and upgrades of controller versions, agents, etc., and ensure EOS/EOL compliance is maintained 
Deliverables 
Ensure on-time delivery of tasks and projects 
Ensure continuous uptime of applications and services 
Ensure no security or audit issues 
Requirements 
Comply with bank standards to track and follow up on the assigned projects 
Cover all areas in application and infrastructure operations of the platform 
Education and Relevant Experience 
You should be a university graduate (computer science or related field) with good experience working with contemporary technologies and scripting languages 
Strong communication skills and the ability to explain protocols and processes to the team and management 
A passion for learning and using new technologies in the open-source communities 
A passion for coding 
Functional / Technical Competencies 
Min 7 years of IT work experience 
Working knowledge of AppDynamics, ELK Stack, Grafana, OpenTelemetry (OTel)
In-depth experience in Unix/Linux/Shell/Python scripting, with attention to quality, scalability, and extensibility 
Experience in triaging and troubleshooting application problems quickly in monitoring tools by using various techniques - Transaction snapshots, Diagnostic Sessions, Data Collectors 
Knowledgeable and experienced in SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency 
Knowledge of Confluent Kafka, Prometheus & other APM tools (Dynatrace, Datadog, New Relic, Splunk) is a plus 
Knowledge of AI/ML capabilities to automate RCAs and shorten MTTR when issues arise 
Good understanding of Network routing, Load balancing and Networking protocols; a base knowledge of TCP/IP, with an understanding of  and DNS 
Ability to contribute to discussions on design and strategy 
Good problem diagnosis and creative problem-solving skills 
Experience in automation tools and CI/CD – Jenkins, Ansible 
Apply 
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.
Job Details 
Primary Location: Singapore-DBS Asia Hub 
Job: Technology 
Schedule: Regular 
Job Type: Full-time 
Job Posting: Aug 12, 2025, 8:00:00 AM 
Seniority level: Not Applicable 
Employment type: Full-time 
Job function: Information Technology 
Industries: Banking, Financial Services, and Investment Banking 