Site Reliability Engineer (Linux/Kernel)
Job Description:
We are looking for a skilled Site Reliability Engineer to join our client's global SRE team in Singapore.
Responsibilities:
Overseeing and ensuring the continuous operation of the firm's Linux-based trading infrastructure, addressing day-to-day operational needs
Providing second-level support, including:
Rapid response to emergencies
Implementing scheduled updates and deployments
In-depth analysis and resolution of performance issues
Participating in a rotational on-call schedule, including early-morning and weekend shifts, to provide timely support
Contributing to the development of automated solutions for server provisioning, configuration, and monitoring, targeting scalable management of thousands of servers
Working closely with the Trading and Core Engineering teams
Managing essential core services such as DHCP, LDAP, DNS, and NFS for on-prem and hosted data centers as well as public clouds
Participating in an on-call rotation and occasional weekend shifts
Qualifications:
Sound expertise in Linux production environments
Basic knowledge of Python and Bash scripting
Experience with automation and monitoring toolsets
Comprehensive knowledge of operating system principles, with a particular focus on Linux internals
Familiarity with Intel-based server hardware and components
Competence in server-side networking, including understanding network protocols and configurations
Familiarity with cloud services and architectural solutions
Experience in designing, building, and troubleshooting complex systems
Good problem-solving skills, underpinned by a methodical approach to technical challenges
Ability to communicate effectively, with strong interpersonal skills, a sense of responsibility, and a commitment to driving projects to completion