Top Senior Site Reliability Engineer Jobs in Austin, TX
Artificial Intelligence • Insurance • Software • Automation
The Staff Site Reliability Engineer will build and scale infrastructure for Assured's platform, automate delivery, enhance observability, and lead mentoring initiatives.
Top Skills:
AWS, Kubernetes, Postgres, Terraform
Artificial Intelligence • Other • Sales • Software
The role involves designing and advancing infrastructure for the engineering team, ensuring the reliability of Kubernetes clusters, automating operations, and building machine learning infrastructure.
Top Skills:
Argo, AWS, Azure, CloudFormation, Flux, Github Actions, Go, GCP, Kubernetes, Postgres, Python, Terraform
Agency • Information Technology
Lead SRE role designing and maintaining CI/CD pipelines (GitHub Actions), containerized deployments (Docker, Kubernetes, AKS, Helm), web/mobile app releases, observability, automated testing, and DevOps best practices across cloud environments with cross-functional collaboration and regulatory compliance.
Top Skills:
Aks, Android, Azure Application Insights, Azure Log Analytics, Azure Monitor, Bash, Branching, Docker, Docker Compose, Git, Git Hooks, Github Actions, Google Play, Helm, Heroku, iOS, Ios App Store, Java, Kubernetes, Npm, Powershell, Pull Requests, Python, Sonarqube, Veracode, Vercel
Internet of Things • Software • Manufacturing
Lead and oversee cloud operations and Site Reliability Engineering for a global IoT ecosystem, architecting strategies for performance, security, and innovation while mentoring a team of professionals in multi-cloud environments.
Top Skills:
Ansible, Azure, Ci/Cd, Cloud, Elk, Grafana, Iot, Kubernetes, Prometheus, Sre, Terraform
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
Aws Eks, Datadog, Github Actions, Go, Istio, K6, Kubernetes, Node.js, Terraform
Computer Vision • Machine Learning • Software
As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.
Top Skills:
AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
Lead architecture and implementation of reliability platforms and SRE practices for a production SaaS. Build self-service reliability tooling, drive AIOps automation, advance observability (monitoring, tracing, profiling), lead incident response and postmortems, mentor engineers, and embed production readiness across teams to achieve 99.99% uptime.
Top Skills:
AWS, Azure, Continuous Profiling, Datadog, Dns, Elk, GCP, Go, Grafana, Http/S, Kubernetes, Load Balancing, Opentelemetry, Prometheus, Python, Tcp/Ip
Other
As a Site Reliability Engineer, you will design cloud platforms, automate operations, maintain infrastructure, and support engineering teams in delivering reliable services.
Top Skills:
Ansible, AWS, Azure, Bash, CircleCI, CloudFormation, Datadog, Dns, Docker, Gitlab Ci, Go, GCP, Grafana, HTTP, Https, Jenkins, Kubernetes, Kvm, Linux, Perl, Prometheus, Python, Ruby, Tcp/Ip, Terraform, Unix, VMware
Healthtech • Other • Software
As a Senior Database Site Reliability Engineer, you'll design, implement, and maintain PostgreSQL systems, ensure reliability, automate maintenance tasks, and participate in incident response.
Top Skills:
Ansible, Bash, Datadog, Grafana, New Relic, Postgres, Powershell, Prometheus, Python, Terraform
Software • Financial Services
Ensure platform reliability, performance, and availability by implementing observability, automating infrastructure, participating in on-call rotations and post-mortems, partnering with Product and Engineering, designing scalable architectures, mentoring teammates, and integrating Dynatrace with Azure DevOps and Jira while supporting compliance (SOC/FedRAMP).
Top Skills:
.Net, Aks, Alpine, Ansible, Appinsights, Arm Templates, AWS, Azure Devops, Bash, Bicep, C#, Chef, CloudFormation, Datadog, Debian, Dynatrace, Eks, GCP, Git, Git, Gks, Grafana, Helm, JIRA, Kubernetes, Log Analytics, Azure, New Relic, Onestream Software, Openshift, Powershell, Powershell Dsc, Prometheus, Puppet, Python, Rest Apis, SQL, Terraform, Ubuntu
Fintech • Information Technology
As a Site Reliability Engineer at Alpaca, you will ensure system reliability and performance, troubleshoot issues, and collaborate with teams to design scalable features.
Top Skills:
Go, Gorm, Linux, Pgx, Postgres, Prometheus, Sqlc
Gaming • Software
The Site Reliability Engineer will manage infrastructure stability and scalability, lead cloud migrations, and optimize performance across systems while mentoring team members.
Top Skills:
Ansible, AWS, Azure, Bash, Chef, CloudFormation, Datadog, Docker, Elk Stack, GCP, Go, Grafana, Kubernetes, Prometheus, Puppet, Python, Terraform, Unix/Linux
Artificial Intelligence • Cloud • Information Technology • Software • Big Data Analytics
Founding Staff SRE for Volcano: define SLOs/error budgets, architect multi-region Kubernetes infrastructure, build GitOps/CI-CD with ArgoCD/Helm/Terraform, scale managed Postgres/Redis/object storage, implement observability with Datadog/Prometheus/Grafana, lead incident response and SRE culture, and mentor cross-functional teams.
Top Skills:
Argocd, Canary Deployments, Ci/Cd, Cni, Datadog, Gitops, Grafana, Helm, Ingress, Kubernetes, Object Storage, Postgres, Prometheus, Redis, Service Mesh, Terraform, Terragrunt
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills:
AWS, Datadog, Grafana, Kubernetes, Opentelemetry, Prometheus, Typescript
Reposted 2 Hours Ago
Financial Services
Own reliability and scalability of on-prem observability platforms (ELK, Grafana); handle production escalations, capacity planning, SLOs, onboarding, automation, IaC (Terraform/Helm/Ansible), upgrades, security hardening, and platform modernization.
Top Skills:
Ansible, Apm Instrumentation, Bash, Beats, Chef, Elasticsearch, Elk Stack, Fluent Bit, Fluentd, Grafana, Helm, Kibana, Linux, Logstash, New Relic, Opentelemetry, Prometheus, Puppet, Python, Shell Scripting/Linux Shell, Solarwinds, Terraform
Healthtech • Software
The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.
Top Skills:
Ai Bots, Datadog, JIRA, Jira Service Management, Ms Teams, Opsgenie, Pagerduty
Real Estate • Financial Services • PropTech
Support and optimize products migrated to AWS, implement cloud best practices, maintain operational coverage, enhance automation, observability, CI/CD/GitOps, and security. Collaborate with development and platform teams to scale, troubleshoot, and ensure reliable SaaS operations.
Top Skills:
Amis, Argocd, AWS, Aws Elastic Beanstalk, Aws Transfer Family, Azure Devops, Bash, Cloudwatch, Curl, Docker, Ec2, Eks, Fluxcd, Git, Gitops, HTTP, Istio, Kubernetes, Linkerd, Load Balancer, Powershell, Python, Rds, SQL, Terraform, Wget
Blockchain
The Blockchain Site Reliability Engineer is responsible for maintaining blockchain nodes' reliability, monitoring, incident response, and building automation tools to enhance operations.
Top Skills:
Docker, Elk, Go, Grafana, JavaScript, Kubernetes, Linux, Prometheus, Python, Rust, Shell
eCommerce
Ensure reliability and availability of Tradeweb's global AWS platform through IaC automation, observability and SLO definition, incident triage and resolution, on-call duties, collaboration with development teams, and security-focused platform improvements.
Top Skills:
Argocd, AWS, Aws Lambda, Eks, Gitsecops, Infrastructure As Code (Iac), Kubernetes (K8S), Kustomize, Lgtm, Linux/Unix, Pulumi, Python, Sms, Sns
Reposted 3 Days Ago
Financial Services
The Senior Site Reliability Engineer will own the operational reliability of developer tooling ecosystems and improve developer productivity through efficient processes and automation.
Top Skills:
.Net, Bash, Powershell, Python
Cloud • Software
The Senior Site Reliability Engineer will automate operations using Python, manage Kubernetes and OpenStack clusters, and ensure high availability for enterprise infrastructures.
Top Skills:
Kubernetes, Linux, Openstack, Python
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
Lead SRE work to keep Circle highly available and performant: respond to incidents, own monitoring/alerting/log management, manage and optimize MySQL/Postgres/ClickHouse/Redis databases, maintain server infrastructure and deployment pipelines, collaborate with engineering teams, and build internal SRE tooling and automation.
Top Skills:
AWS, Clickhouse, Kubernetes, Llm-Based Tools (Copilots), MySQL, Postgres, Redis
Information Technology • Security
The Staff Site Reliability Engineer will lead the architecture and security of the SimSpace cyber range platform, focusing on reliability, automation, and observability across diverse deployment environments while mentoring engineers and driving infrastructure initiatives.
Top Skills:
Argocd, Github Actions, Go, Grafana Tanka, Jsonnet, Kubernetes, Python
Artificial Intelligence • Cloud • Information Technology • Software
As a Staff SRE, you will ensure the reliability and performance of Andromeda's GPU infrastructure, lead incident responses, build observability systems, and mentor engineers, while collaborating closely with engineering and customers.
Top Skills:
Ansible, Cuda, Go, Helm, Kubernetes, Linux, Nccl, Nvidia, Python, Rust, Slurm, Terraform
Cloud • Software • Analytics
Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.
Top Skills:
Ansible, Bash, GCP, Gke, Go, Kubernetes, Pulumi, Python