About the Role
We are seeking a Senior Manager of Kubernetes Observability to provide strategic leadership for the design, standardization, and scaled execution of our enterprise observability ecosystem across Kubernetes and OpenShift platforms, including Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE). This role is responsible for ensuring a robust, unified, and automated observability platform that enables reliability, performance, and operational excellence across all clusters and workloads in hybrid and multi-cloud environments.
As a senior technology leader, you will define the long-term vision and operating model for metrics, logging, tracing, eventing, and monitoring standards across on-prem, cloud-managed, and hosted Kubernetes platforms. You will guide multiple engineering teams to execute consistently against this strategy, ensuring full instrumentation, proactive issue detection, reduced MTTR, and improved platform stability. Through strong architectural direction, organizational alignment, and focused mentorship, you will elevate engineering maturity and ensure developers and SREs have actionable insights that accelerate innovation and support enterprise growth at scale.
Key Responsibilities
Kubernetes Observability Strategy & Operating Model
- Define the target-state vision and multi-year roadmap for observability across Kubernetes, OpenShift, AKS, and GKE, including metrics, logging, tracing, eventing, and alerting standards.
- Establish a unified observability operating model that ensures consistency, scalability, and reuse across on-prem, cloud-managed, and multi-cloud Kubernetes environments.
- Define success metrics and outcomes that measure observability effectiveness, reliability improvements, and reductions in MTTR across all platforms.
Platform Architecture, Standardization & Instrumentation
- Set architectural direction for enterprise observability platforms, tooling, and telemetry pipelines across Kubernetes, OpenShift, AKS, and GKE.
- Establish standardized instrumentation patterns for clusters, workloads, control planes, and platform services, ensuring complete and consistent telemetry coverage regardless of Kubernetes distribution or cloud provider.
- Drive convergence toward unified observability frameworks that abstract provider-specific differences while preserving deep platform insight.
Automation, Telemetry Workflows & Adoption
- Drive automation of observability onboarding and telemetry workflows across Kubernetes, AKS, and GKE to reduce manual effort and accelerate adoption.
- Enable self-service observability capabilities that allow developers and SREs to easily instrument, monitor, and troubleshoot workloads across cloud and on-prem clusters.
- Ensure observability is embedded by default into platform, infrastructure-as-code, and application delivery pipelines.
Reliability, Monitoring & Operational Excellence
- Enable proactive issue detection through scalable alerting frameworks, actionable dashboards, and standardized monitoring practices across all Kubernetes platforms.
- Improve reliability and performance visibility for workloads running on OpenShift, AKS, and GKE, reducing reliance on reactive troubleshooting.
- Partner with SRE and operations teams to continuously improve incident response, post-incident learning, and preventative engineering across hybrid and multi-cloud environments.
Leadership, Organization & Cross-Team Alignment
- Lead, mentor, and develop engineering leaders and teams responsible for observability platform components and services.
- Align platform, SRE, cloud, and application teams around shared observability standards and operational goals across Kubernetes, AKS, and GKE.
- Strengthen cross-team collaboration and engineering rigor to raise overall organizational maturity in observability and operations.
Required Qualifications
- 6+ years of software engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
- 3+ years of management or leadership experience.
- 5+ years of experience in platform engineering, reliability engineering, or observability-focused technical leadership roles, or equivalent demonstrated experience.
- 6+ years of experience with Grafana and Splunk.
- 5+ years of experience with Kubernetes observability concepts, including metrics, logging, tracing, eventing, and monitoring platforms, across OpenShift, AKS, and GKE.
Desired Qualifications
- 6+ years of people management or senior technical leadership experience guiding multiple engineering teams.
- Demonstrated success defining and scaling enterprise observability platforms across large, multi-cloud Kubernetes environments.
- Strong understanding of SRE, operational excellence, and reliability engineering practices.
- Experience driving automation and standardization to reduce MTTR and operational toil.
- Proven ability to influence across platform, infrastructure, cloud, and application teams.
- Strong executive communication skills, including the ability to articulate strategy, tradeoffs, and outcomes to senior stakeholders.
Job Expectations
- There is no visa sponsorship available for this position.
- There is no relocation allowance available for this position.
- This position requires working in one of the posted locations in a hybrid environment.
Top Skills
Azure Kubernetes Service
Eventing
Google Kubernetes Engine
Grafana
Kubernetes
Logging
Metrics
Monitoring
OpenShift
Splunk
Tracing

