Responsibilities:
- Meet stringent SLA requirements, ensuring zero service-level breaches and 100% compliance with policies and procedures.
- Work with cross-functional teams, including developers and DevOps, to deliver cloud solutions.
- Develop Azure Consistent Storage (ACS) and App Service on the Azure Stack Hub hybrid cloud computing solution.
- Configure the underlying infrastructure and services within Azure Stack Hub, using tools such as Azure PowerShell and client libraries to interact with these services.
- Analyse problems, perform troubleshooting, and track problems through resolution.
- Develop troubleshooting utilities and log analysis dashboards, and write troubleshooting guides for cloud engineers' reference.
- Perform monitoring activities, such as security and bandwidth usage monitoring.
- When necessary, escalate problems to the appropriate stakeholders / principals.
- Ensure support tickets are generated and managed according to SLA requirements.
- Schedule and run periodic maintenance for security infrastructure at the data centre.
- Plan and manage change requests and/or maintenance activities for security operations, including firmware patches.
- Perform change requests and service requests as assigned by the team lead, in accordance with the SLA.
- Manage project stakeholders, including vendors and principals, to align on technical deliverables and schedules.
- Prepare project documentation such as implementation plans, risk assessments, and post-project reviews.
- Provide 24/7 offsite standby after office hours.
Requirements:
- Bachelor's degree (Master's preferred) in Computer Science, Information Technology, or Engineering.
- At least 10 years of experience in IT systems and server design and administration, including deploying and maintaining Windows Server and Linux virtual machines in enterprise environments to achieve network interoperability, permission management, and system optimization.
- 5 to 8 years of experience in cloud operations, focusing on cloud infrastructure setup and maintenance across platforms such as AWS, Azure, or GCP.
- At least 5 years of experience managing the Microsoft Azure Stack Hub hybrid cloud computing solution, with a strong understanding of the differences and similarities between Azure and Azure Stack Hub, and a focus on capacity planning and integration with existing tenant infrastructure.
- Minimum 3 years of experience in cloud security administration.
- Must have relevant cloud and IT systems professional certifications, such as Azure Stack Hub Operator Associate (AZ-600), Azure Administrator Associate (AZ-104), RHCE, and VCP6.
- ITIL certification will be an advantage.
- Must be able to provide 24/7 offsite standby after office hours and commit to a 60-minute response time once activated.
Special Knowledge or Skill:
- Expertise in Linux (RHEL), networking, infrastructure-as-code (Terraform, Ansible), containerization (Docker, Kubernetes), CI/CD, monitoring (Grafana, Elasticsearch), and cloud security.
- Proficiency in scripting languages such as Python or Bash for automation.
- Ability to develop troubleshooting commands and scripts with PowerShell.
- Ability to develop troubleshooting dashboards with Kusto Query Language (KQL) / Azure Data Explorer (ADX) and Power BI.
- Azure Networking – VNet, VPN Gateway, ExpressRoute, L4/L7 Load Balancing.
- Microsoft SDN (Software Defined Networking) private cloud networking.
- Experience in systems administration and task automation (Windows Server, Linux servers, and virtualization).
- Good knowledge of the following products will be advantageous: BeyondTrust, SEPM, RSA, Palo Alto, Check Point, FortiGate, and SafeNet.
- Able to handle demanding service response and recovery turnaround times.
- Meticulous and process-oriented.
- Good, hard-working attitude, with the ability to work well under pressure.
- Good communication skills in English (both written and spoken).
- Good analytical skills, with the ability to work with others to resolve problems.
- Good organizational skills, with the ability to properly document and track information.