We are looking for a highly skilled Kafka Engineer / Architect who can design, build, and operate enterprise-scale streaming platforms.
This is a hands-on role that combines responsibilities across architecture, cluster installation, administration, and application development.
The ideal candidate has deep knowledge of Kafka internals, Confluent Platform components, and event-driven application design.
Key Responsibilities
Architecture & Design
- Design scalable and fault-tolerant Confluent Platform / Confluent Cloud architectures (on-prem, cloud, hybrid).
- Define data streaming strategies, including topic design, partitioning, replication, retention, and compaction policies (see the sketch after this list).
- Architect disaster recovery (DR), geo-replication, and high availability solutions.
- Establish security, governance, and compliance models (TLS, mTLS, SASL, RBAC, ACLs).
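To make the topic-design bullet concrete, here is a minimal sketch using the Kafka AdminClient. The bootstrap address, topic names, partition counts, and retention values are illustrative placeholders, not recommendations:

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class TopicDesignSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Time-retained event topic: 12 partitions for parallelism, RF 3 for fault tolerance.
            NewTopic orders = new NewTopic("orders", 12, (short) 3)
                .configs(Map.of(
                    TopicConfig.RETENTION_MS_CONFIG, String.valueOf(7L * 24 * 60 * 60 * 1000), // 7 days
                    TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2")); // 2 replicas must ack each write

            // Compacted changelog topic: keeps only the latest record per key.
            NewTopic customers = new NewTopic("customers", 6, (short) 3)
                .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));

            admin.createTopics(Set.of(orders, customers)).all().get();
        }
    }
}
```

The compacted customers topic suits changelog-style data where only the latest value per key matters, while the time-retained orders topic suits append-only event streams.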
Installation & Management
- Install, configure, and upgrade Confluent Platform components (Brokers, Schema Registry, Connect, ksqlDB, Control Center).
- Manage Kafka clusters on bare metal, VMs, Kubernetes (CFK/Strimzi), or in cloud environments.
- Implement monitoring, alerting, and observability (Prometheus, Grafana, Control Center, Splunk, Datadog).
- Perform capacity planning, scaling, and performance tuning of Kafka clusters.
- Ensure data reliability and cluster stability through proactive health checks and troubleshooting (one such check is sketched below).
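As one example of a proactive health check, the sketch below lists partitions whose in-sync replica set has shrunk below the full replica set, assuming Kafka clients 3.1+ for allTopicNames(); the bootstrap address is a placeholder:

```java
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListTopicsOptions;

public class UnderReplicationCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            Set<String> topics = admin.listTopics(new ListTopicsOptions().listInternal(false))
                                      .names().get();
            // Flag any partition whose in-sync replica set is smaller than its full replica set.
            admin.describeTopics(topics).allTopicNames().get().forEach((topic, description) ->
                description.partitions().forEach(p -> {
                    if (p.isr().size() < p.replicas().size()) {
                        System.out.printf("UNDER-REPLICATED: %s-%d isr=%d of %d%n",
                            topic, p.partition(), p.isr().size(), p.replicas().size());
                    }
                }));
        }
    }
}
```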
Development & Integration
- Build producers, consumers, and stream-processing applications using Java, Python, Kafka Streams, or ksqlDB (a producer sketch follows this list).
- Develop Kafka Connect pipelines for integrating databases, APIs, and data lakes/warehouses.
- Implement schema evolution and governance using Confluent Schema Registry (Avro, Protobuf, JSON Schema).
- Partner with developers to build event-driven microservices and optimize client-side performance.
- Establish CI/CD pipelines and automate deployments using Terraform, Ansible, Helm, or GitOps.
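A minimal sketch of the kind of client code this covers: an idempotent Java producer that keys records to preserve per-key ordering. The topic name, key, and payload are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // no duplicates on retry; implies acks=all

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by order ID routes all events for one order to the same partition,
            // preserving per-key ordering for downstream consumers.
            producer.send(new ProducerRecord<>("orders", "order-42", "{\"status\":\"CREATED\"}"),
                (metadata, exception) -> {
                    if (exception != null) {
                        exception.printStackTrace(); // real code would alert or route to a DLQ
                    } else {
                        System.out.printf("written to orders-%d@%d%n", metadata.partition(), metadata.offset());
                    }
                });
        } // close() flushes any in-flight records
    }
}
```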
Operations & Support
- Provide day-to-day operational support for Kafka environments.
- Lead incident troubleshooting, root cause analysis, and recovery efforts.
- Manage security patches, upgrades, and audit compliance.
- Tune stateful streaming workloads (RocksDB, memory, JVM, network); see the sketch after this list.
- Work with cross-functional teams (data, security, infra) to ensure enterprise adoption.
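For the RocksDB tuning mentioned above, Kafka Streams exposes a RocksDBConfigSetter hook. A minimal sketch, with illustrative values only:

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class BoundedMemoryRocksDbConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        // Illustrative values only; tune against observed memory use and flush/compaction metrics.
        options.setWriteBufferSize(16 * 1024 * 1024L); // 16 MB memtable per store
        options.setMaxWriteBufferNumber(3);            // bound the number of in-memory memtables
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing custom to release here; required by the interface since Kafka 2.3.
    }
}
```

The class would be registered via props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, BoundedMemoryRocksDbConfig.class) in the Streams application configuration.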
Required Skills & Experience
- 7+ years in software engineering or infrastructure roles, with 3–5 years focused on Kafka/Confluent.
- Strong hands-on expertise in Kafka cluster administration and application development.
- Proficiency in Confluent Platform components: Brokers, Connect, Schema Registry, ksqlDB, Control Center.
- Solid understanding of Kafka internals (ISR, partitioning, leader election, log compaction).
- Experience with security protocols: TLS, mTLS, SASL (SCRAM, Kerberos, OAuth), ACLs, RBAC (a client configuration sketch follows this list).
- Strong programming skills in Java, Scala, or Python.
- Familiarity with event-driven patterns (CDC, CQRS, event sourcing).
- Experience with Kubernetes (CFK) and IaC (Terraform, Ansible, Helm).
- Knowledge of stream processing (Kafka Streams, Flink, Spark Streaming).
- Strong problem-solving skills, with the ability to dig into cluster issues, performance bottlenecks, and production outages.
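As a concrete instance of the security skills listed above, a hedged sketch of a SASL_SSL + SCRAM client configuration; the hostname, credentials, and truststore path are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.config.SslConfigs;

public final class SecureClientProps {
    public static Properties build() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker1:9093"); // placeholder
        // TLS for encryption in transit, SASL/SCRAM for authentication.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512");
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
            "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"app-user\" password=\"change-me\";"); // placeholder credentials
        // Trust store used to verify broker certificates (path and password are placeholders).
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/secrets/client.truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "change-me");
        return props;
    }
}
```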
Preferred Qualifications
- Exposure to CI/CD pipelines and DevOps practices.
Why This Role?
- You'll work end-to-end: architecting, installing, managing, and developing Kafka-based solutions.
- Direct impact on building real-time, event-driven enterprise platforms.
- Opportunity to work with cutting-edge Confluent technologies at scale.