BrightHire

Senior Site Reliability Engineer

Reposted 25 Days Ago
Remote
Hiring Remotely in USA
Senior level
The Senior Site Reliability Engineer will ensure the reliability and performance of critical systems by improving observability, database performance, Kubernetes management, and CI/CD pipelines, while enhancing developer experience and infrastructure.
The summary above was generated by AI

BrightHire is a category-creating, high-growth, Series B software company with a mission to give everyone the hiring experience they deserve.

We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients range from some of the world’s most innovative companies, including Canva, OpenAI, Ramp, and HubSpot, to the Fortune 500.

Location

Remote - USA

About the Role

You will own the end-to-end reliability and performance of many of our most critical systems. Working in lockstep with Product and Engineering, you will design, build, and refine the platform that our application and AI features run on, from Kubernetes and databases through CI/CD and observability. You will focus on keeping our systems fast, reliable, and easy for developers to work with. You will work on real infrastructure that supports features people use every day—things like:

  • Continuing to improve and iterate on our observability stack, which includes Kibana, Grafana, OpenTelemetry (OTel), and Elastic.
  • Improving database performance by analyzing slow and high-volume queries, tuning indexes, optimizing query patterns and timing, and recommending schema and code changes to keep QPS and latency low.
  • Improving and upgrading Kubernetes, including deploying new services, improving resource utilization, tightening security, and standardizing deployment patterns across teams.
  • Improving CI/CD pipelines for both backend and frontend services so engineers can ship quickly and safely, with clear feedback loops, fast build times, and reliable rollbacks.
  • Enhancing the local developer experience so that running and debugging the app locally feels fast, consistent, and representative of production.
  • Helping improve our CI/CD and observability for our ML pipeline and models, bringing MLOps best practices into our existing infrastructure.
What You’ll Bring
  • You have real-world experience running production systems and doing SRE, Platform, or DevOps work for web applications or APIs.
  • You are comfortable working across Kubernetes, CI/CD, databases, and backend services, and you enjoy owning problems end to end.
  • You have strong experience with Kubernetes in production environments, including cluster upgrades, workload deployments, scaling, and debugging.
  • You have experience with observability stacks (such as Elasticsearch and Kibana, Prometheus, Grafana, or similar) and can lead efforts like upgrading Kibana to new major versions and improving logs, metrics, and dashboards.
  • You have worked deeply with relational databases and SQL, know how to profile slow queries, design and tune indexes, and work with engineers to adjust query patterns, timing, and frequency to improve performance.
  • You are comfortable in at least one backend language (e.g., Python) and can read and modify application code to support infra and performance improvements.
  • You have experience improving CI/CD pipelines, including build and test speed, deployment workflows, and release strategies (such as blue/green or canary).
  • You have worked with infrastructure-as-code tools or similar patterns to manage environments in a repeatable way.
  • You think deeply about developer experience and reliability and use both metrics and empathy to guide your decisions.
  • You care about security, resiliency, and cost as integral aspects of the systems you build and manage.
  • You move fast and independently, but you know when to pull in teammates for pairing, reviews, or cross-team alignment.
About Our Team
  • You’ll have the opportunity to work on high-impact projects in small, autonomous squads, with the flexibility to lead initiatives or focus as an individual contributor depending on your goals and interests.
  • Our developer experience is thoughtfully designed, with fast CI (< 10 minutes), 1-click deploys, strong observability, and a clean codebase that enables you to move quickly and confidently.
  • Our culture supports sustainable, focused work with fully remote roles, regular working hours, no-meeting Wednesdays, and flexible time off to recharge when needed.
  • Our team is composed of smart, collaborative, and genuinely kind people, creating an environment where you can learn, grow, and do your best work.
Equal Employment Opportunity (EEO) Statement

Our company does not discriminate in employment on the basis of race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor.

*Note to Recruiters and Placement Agencies: We do not accept unsolicited agency resumes. Please do not forward unsolicited agency resumes to our website. We will not pay fees to any third-party agency or firm and will not be responsible for any agency fees associated with unsolicited resumes. Unsolicited resumes received will be considered our property.
 

Top Skills

CI/CD
Elasticsearch
Grafana
Kibana
Kubernetes
Prometheus
Python
SQL


