Base is America’s next-generation power company. We’re rebuilding the foundation of modern civilization–electricity–by deploying a vast network of distributed batteries that is transforming today’s fragile, centralized grid into a resilient and abundant system. We are engineers, operators, and creatives solving some of the most complex, interdisciplinary challenges of our time.
About the Role
You will define and own how Base runs production systems.
As the first Site Reliability Engineer at Base, you will design the technical and operational model for running our services across cloud infrastructure and hardware-connected systems with real-world constraints. This role sits at the boundary between software, infrastructure, and physical systems, where failures are observable, impactful, and sometimes irreversible.
This is not a DevOps role.
What You’ll Do
Build and operate services and infrastructure that directly improve the reliability of Base’s production systems.
Design and own Base’s observability and operational tooling, including metrics, logs, traces, alerting, and the internal workflows engineers use to understand and operate live systems.
Define alerting and on-call standards so pages are actionable, correctly scoped, and tied to real user or system impact.
Establish and evolve how Base responds to reliability incidents, including response coordination, root cause analysis practices, and driving permanent fixes at the code, infrastructure, and architecture levels.
Own access control and auditability for production systems, including role-based access, permission boundaries, and security logging.
What You’ll Bring
Strong coding ability in modern languages; we use Go, but value engineers who can learn and adapt quickly.
Experience deploying and operating infrastructure defined through infrastructure as code (e.g., Terraform, Pulumi).
Experience designing shared infrastructure and standards that enable teams to operate autonomously, without centralizing ownership or creating bottlenecks.
Experience owning and operating production systems with meaningful uptime and reliability requirements.
A deep understanding of reliability fundamentals, including failure modes, redundancy, load, and capacity.
Hands-on experience building highly observable systems, with clear opinions on metrics, logs, traces, and alerting.
Experience diagnosing failures in live systems by grounding decisions in production data rather than intuition.
Sound judgment when balancing reliability, development velocity, and system complexity.
Comfort taking end-to-end responsibility for systems in a fast-growing, ambiguous environment.
First Principles Thinking: Question assumptions. Principles > rules.
Operate at Base Pace: Focus on what matters, act quickly, and learn by doing.
Give & Get Feedback: Be direct, be humble, and maintain a growth mindset.
Everyone’s an Owner: Follow through on commitments and own results.
Strong Opinions, Loosely Held: Drive clarity and make calls with imperfect information.
Committed to the Mission: Rebuilding the grid is a big challenge. We work hard because we care deeply about the impact we’re creating. We work in-person. It’s not a 9-to-5. We are all-in.
Fun & Optimism Coexist with Grit: Collaboration and celebration coincide with the intensity of building real things.
Base Power Company Austin, Texas, USA Office
205 E Riverside Drive, Austin, TX, United States, 78704


