We’re at the forefront of a once-in-a-generation change in the broadband industry. Join us as we innovate, help our customers reach their potential, and connect underserved communities with unrivaled digital experiences.
This is a remote-based position in the US. Please note that as part of the recruitment and hiring process, there is an in-person meeting that will take place.
We are seeking a skilled and experienced Staff Cloud Platform Engineer with expertise in Kafka to join our Cloud Platform team. The Staff Cloud Platform Engineer will design, deploy, operate, and optimize our Apache Kafka-based event streaming infrastructure at scale on Google Cloud Platform (GCP). The ideal candidate will have a strong background in DevOps practices, cloud infrastructure automation, and big data technologies. In this role you will partner closely with platform, data, and application engineering teams to ensure our Kafka clusters are reliable, performant, and secure, running natively on GCP or AWS.
Responsibilities:
Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems (a Connect REST sketch follows this list).
Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
Maintain Schema Registry and enforce schema governance standards across teams.
Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health (a lag-measurement sketch follows this list).
Design and implement cloud infrastructure using Infrastructure as Code (Terraform).
Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
Create self-service tooling and runbooks to reduce toil for development teams.
Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling (see the provisioning sketch after this list).
Integrate tools such as GitLab CI/CD or Cloud Build for automated testing and deployment.
Ensure seamless integration of data pipelines with other GCP services such as BigQuery and Cloud Storage.
Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.
Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.
Collaborate extensively with cross-functional teams to understand their requirements; educate them through documentation and training, and improve adoption of the platforms and tools.
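For context on the Kafka Connect responsibility above, here is a minimal sketch of registering a connector through the Connect REST API. The worker URL, connector name, topic, and file path are illustrative placeholders (not values from this posting); the connector class shown is the stock FileStreamSource that ships with Kafka.

```python
# Hypothetical sketch: register a Kafka Connect connector via the Connect REST API.
import json
import urllib.request

CONNECT_URL = "http://connect-worker:8083/connectors"  # assumed worker endpoint

connector = {
    "name": "demo-file-source",  # placeholder connector name
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/var/log/app/events.log",  # placeholder source file
        "topic": "app.events",              # placeholder destination topic
    },
}

# POST the connector definition to the Connect worker.
req = urllib.request.Request(
    CONNECT_URL,
    data=json.dumps(connector).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```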
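The consumer-lag SLI mentioned above can be sampled by comparing each partition's committed offset to its high watermark. A minimal sketch follows, assuming the confluent-kafka Python client; the broker address, consumer group, and topic name are hypothetical.

```python
# Hypothetical sketch: measure consumer-group lag per partition (SLI input).
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "broker-1:9092",   # placeholder broker
    "group.id": "orders-processor",         # the group whose lag we observe
    "enable.auto.commit": False,
})

# Find the topic's partitions, then compare committed offsets to high watermarks.
metadata = consumer.list_topics("orders.events", timeout=10)
partitions = [TopicPartition("orders.events", p)
              for p in metadata.topics["orders.events"].partitions]

total_lag = 0
for tp in consumer.committed(partitions, timeout=10):
    _, high = consumer.get_watermark_offsets(tp, timeout=10)
    lag = high - tp.offset if tp.offset >= 0 else high  # negative offset = no commit yet
    total_lag += lag
    print(f"partition {tp.partition}: lag={lag}")

print(f"total lag for group: {total_lag}")
consumer.close()
```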
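Topic provisioning of the kind described above can be automated with the Kafka Admin API. The sketch below assumes the confluent-kafka Python client; the topic name, partition count, and retention settings are illustrative defaults rather than team standards.

```python
# Hypothetical sketch: automated topic provisioning with the Kafka AdminClient.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "broker-1:9092"})  # placeholder broker

topic = NewTopic(
    "orders.events",                  # hypothetical topic name
    num_partitions=12,
    replication_factor=3,
    config={
        "retention.ms": "604800000",  # 7-day retention
        "min.insync.replicas": "2",   # durability guardrail
    },
)

# create_topics() is asynchronous and returns one future per topic.
for name, future in admin.create_topics([topic]).items():
    try:
        future.result()               # raises if creation failed
        print(f"created topic {name}")
    except Exception as exc:
        print(f"failed to create {name}: {exc}")
```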
Qualifications:
10+ years of overall experience in DevOps, cloud engineering, or data engineering.
5+ years of experience operating Kafka at production scale.
Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
Proficiency with container orchestration (Kubernetes/Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
Knowledge of containerization and orchestration tools like Docker and Kubernetes.
Strong scripting skills for automation (e.g., Bash, Python).
Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
Familiarity with logging tools like Cloud Logging or ELK Stack.
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Ability to work in a fast-paced, agile environment.
#LI-Remote
The base pay range for this position varies based on the geographic location. More information about the pay range specific to candidate location and other factors will be shared during the recruitment process. Individual pay is determined based on location of residence and multiple factors, including job-related knowledge, skills and experience.
San Francisco Bay Area:
156,400 - 265,700 USD Annual
All Other US Locations:
As a part of the total compensation package, this role may be eligible for a bonus.