Key Responsibilities
Design and implement hybrid cloud observability and monitoring solutions across multiple environments.
Develop and manage alerting systems, metrics, and dashboards for proactive issue detection.
Integrate logging pipelines for structured and unstructured data sources.
Implement log archiving strategies (e.g., S3) for compliance and cost optimization.
Perform advanced log analysis and correlation to support root cause investigation and performance tuning.
Collaborate with infrastructure and development teams to define and track SLIs, SLOs, and SLAs.
Required Skills
Strong experience with Splunk, Prometheus, Grafana, and Amazon CloudWatch.
Proficiency with ELK/EFK stacks (Elasticsearch, Logstash/Fluentd, Kibana).
Hands-on experience creating custom dashboards, alerts, and metrics visualizations.
Experience building and managing centralized logging pipelines across distributed systems.
Familiarity with S3 log archiving and multi-environment data integrations.