We are hiring a Platform Engineer – Big Data (Hadoop / Spark / Hive) with the following requirements:
Responsibilities
- Design, implement, and manage big data platforms built on Hadoop, Spark, and Hive.
- Develop and optimize data ingestion, transformation, and storage frameworks for high-volume, high-velocity datasets.
- Maintain and fine-tune Hadoop clusters (HDFS, YARN) including capacity planning, monitoring, and troubleshooting.
- Build and support Hive-based data warehouses for analytics, including schema design, partitioning, and performance optimization.
- Optimize Spark jobs for batch and streaming use cases (memory management, shuffle tuning, query optimization); a brief illustration of this and the Hive partitioning work follows this list.
- Implement data security and governance policies (Kerberos, Ranger, Atlas) to ensure compliance and controlled access.
- Collaborate with data engineers, architects, and analysts to deliver robust and efficient platform capabilities.
- Drive automation in cluster operations, deployments, and monitoring using scripting and DevOps tools.
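As a rough illustration of the Hive partitioning and Spark tuning work described above, here is a minimal PySpark sketch. The table, columns, source path, and tuning values are hypothetical placeholders for illustration only, not part of our actual stack.

```python
from pyspark.sql import SparkSession

# Hive-enabled session; the tuning values are illustrative starting points,
# not recommendations for any particular cluster.
spark = (
    SparkSession.builder
    .appName("events-ingest")
    .config("spark.sql.shuffle.partitions", "200")    # size shuffles to the data volume
    .config("spark.sql.adaptive.enabled", "true")     # let AQE coalesce small shuffle partitions
    .config("hive.exec.dynamic.partition", "true")    # allow dynamic partition inserts
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .enableHiveSupport()
    .getOrCreate()
)

# Partitioning the warehouse table by event_date lets queries prune
# their scans down to only the dates they actually touch.
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics.events (
        user_id BIGINT,
        action  STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
""")

# Hypothetical raw source; insertInto maps columns by position,
# so select them in the table's declared order.
events = spark.read.parquet("/data/raw/events")
(events
    .select("user_id", "action", "event_date")
    .repartition("event_date")                        # one shuffle, grouped by target partition
    .write
    .mode("overwrite")
    .insertInto("analytics.events"))
```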
Skills
- Experience in data engineering or platform engineering with a strong focus on big data ecosystems.
- Deep expertise in Hadoop (HDFS, YARN) cluster management and optimization.
- Strong hands-on experience in Spark (batch + streaming) and Hive data modeling.
- Experience in performance tuning, query optimization, and cluster troubleshooting.
- Familiarity with distributed system internals, resource management, and fault tolerance.
- Knowledge of cloud big data platforms (AWS EMR, Azure HDInsight, GCP Dataproc) is a plus.
- Exposure to automation frameworks (Ansible, Terraform, Airflow, CI/CD pipelines) is preferred; a minimal Airflow sketch follows below.
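As a rough illustration of the orchestration exposure mentioned above, here is a minimal Airflow DAG sketch, assuming Airflow 2.x with the Apache Spark provider installed. The DAG id, connection id, application path, schedule, and tuning values are all hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Hypothetical daily pipeline; ids and paths below are placeholders,
# not part of any real deployment.
with DAG(
    dag_id="daily_events_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = SparkSubmitOperator(
        task_id="spark_ingest",
        conn_id="spark_default",            # Airflow connection pointing at the cluster
        application="/jobs/events_ingest.py",
        conf={
            "spark.executor.memory": "4g",          # illustrative tuning values
            "spark.sql.shuffle.partitions": "200",
        },
    )
```

The same pattern extends to cluster operations work in this role, for example triggering Ansible playbooks or Terraform runs from CI/CD pipelines.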