Principal Site Reliability Engineer
About H-E-B
H-E-B is one of the largest independently owned food retailers in the nation, operating over 400 stores throughout Texas and Mexico, with annual sales of over $25 billion. Described by industry experts as a daring innovator and smart competitor, H-E-B has led the way with creative new concepts, outstanding service, and a commitment to diversity in our workforce, workplace, and marketplace. H-E-B offers our 109,000+ Partners (employees) a wealth of career opportunities, a competitive compensation and benefits program, and comprehensive training that lead to successful careers.
H-E-B Digital is seeking new team members (Partners)! Since our inception, we’ve been investing heavily in our customers’ digital experience, reinventing how they find inspiration from food, how they make food decisions, and how they ultimately get food into their homes. This is an exciting time to join H-E-B Digital, and we’re hiring across the stack: front-end web and mobile, full-stack, and backend engineering. We’re using the best available technologies to deliver modern, engaging, reliable, and scalable experiences to meet the needs of our growing audience. Our digital solutions are growing in popularity and adoption—like Curbside and Home Delivery—so you’ll get the opportunity to define the user experience for millions of customers and hundreds of thousands of Partners. If you’re someone who enjoys taking on new challenges, working in a rapidly changing environment, learning new skills, and applying it all to solve large and impactful business problems, we want you as part of our team.
Our Partners thrive The H-E-B Way. In the Principal Site Reliability Engineer job, that means you have a…
HEART FOR PEOPLE… you have a passion for mentorship and guidance, and a love for the direct person-to-person interactions that create strong bonds between teams
HEAD FOR BUSINESS… you have an ownership mentality and a consistent track record of timely delivery of high-quality software
PASSION FOR RESULTS… you have the ability to guide the discussion, remove roadblocks, and provide guardrails for your team as they identify challenges and propose solutions
What you’ll do at H-E-B:
We enable the successful operation, secure modification, and agile creation of large-scale, fault-tolerant systems that delight our customers beyond expectation, which is easier said than done.
As a Principal Site Reliability Engineer, your job is to join in that mission with a squad of other tenacious engineers to ensure the world-class performance, efficiency, change management, monitoring, capacity planning, and emergency response policies of our software, infrastructure, and its dependencies. Your goal, ultimately, is to engineer operationally efficient and performant solutions, increase system observability, minimize human interactions with production systems, accelerate customer value delivery, and help evangelize those best practices to others.
We enable the reliability that makes building fantastic H-E-B Digital products possible, and we are incredibly proud of that.
Who You Are
- You have an ownership mentality and a consistent track record of successful, high-quality results.
- You have 7+ years of experience in software and infrastructure.
- Interest in designing, building, analyzing, operating, and troubleshooting distributed cloud systems.
- Systematic problem-solving approach, coupled with strong interpersonal skills and intrinsic motivation to get things accomplished well.
- Ability to debug, instrument, and optimize code, describe system performance characteristics, and automate routine tasks.
- Experience working with large data sets, ETL, streaming, eventing and messaging, and multi-cloud environments.
- Deep understanding of algorithms, data structures, automation, complexity analysis, and software design.
- Experience with layer 3 networking, routing tier, network security, latency optimization, and appliance abstractions like WAF.
- Experience with layer 1 Linux operating systems, optimization and performance tuning, distributed file systems, databases, container orchestration, and distributed tracing.
What You'll Do
- Coach and mentor junior and senior engineers in engineering techniques, processes, and new technologies; enable others to succeed.
- Improve the observability pipeline and establish baseline capabilities for service level indicators.
- Engage in and improve the software delivery lifecycle from establishing acceptance criteria through deployment, operation, and refinement.
- Ensure the success of new services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and readiness reviews.
- Engage and nurture other squads so they can maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through automation, agile improvement, and dynamic resource utilization.
- Practice sustainable incident management and foster a blameless retrospective culture.
Who We Are
- H-E-B is one of the largest independently owned food retailers in the nation, operating over 400 stores throughout Texas and Mexico, with annual sales of over $26 billion.
- We hire talented people (116,000+ Partners) and give them autonomy to be creative in how they impact the business.
- We’re a Partner-driven company with a Bold Promise – Because People Matter.
- We embrace Diversity and Inclusion as core values and support them with thriving company-wide programs.
- We’re a truly original Texas-based company that created the Spirit of Giving to help Texas communities every day.
- Once eligible, our Partners become Owners in the company. “Partner-owned” means our most important resources—People—drive the innovation, growth, and success that make H-E-B The Greatest Retailing Company.