Restate

Senior Cloud Infrastructure Engineer

Posted Yesterday
Remote
Hiring Remotely in United States
Senior level
Design, build, and operate Restate Cloud and BYOC deployments across multi-tenant SaaS and on-prem environments. Implement IaC and cloud orchestration for Kubernetes-based stateful workloads, ensure reliability and observability (SLOs, metrics, traces, logs, runbooks), automate fleet scaling, and participate in on-call rotations supporting production operations.
The summary above was generated by AI
Senior Cloud Infrastructure Engineer at Restate

Restate (restate.dev) is a lightweight runtime that turns AI agents, workflows, and backend services into durable processes - so teams can focus on their logic, not failure mechanics.

The role: We're looking for a Senior to Staff-level cloud infrastructure engineer to work across all product pillars: OSS, on-prem deployments, multi-tenant SaaS, and BYOC (bring your own cloud). This means deep work in our Rust-based infrastructure layer, integrating with cloud provider APIs, building infrastructure-as-code tooling, and ensuring reliability and security at scale. You'll have significant ownership over major parts of our cloud infrastructure.

The opportunity


Front-row seat to the biggest infra shift in decades

Durable runtimes like Restate are becoming the next foundational infrastructure component - and increasingly a critical piece for AI applications. As systems become more agentic, long-running, integration-heavy, and failure-prone, durable execution turns reliability from a bespoke engineering tax into a default property. In this role, you’re not watching that shift from the sidelines - you help build the platform that enables it.

State-of-the-art tech, built from first principles

Restate re-imagines durable execution as a lightweight self-contained stack - no database required - and ships as a single Rust binary with an optimized custom storage layer, low-latency orchestration, and an analytics engine for observability.

Enterprise Traction

Restate is already used by Fortune 500 companies, including Tier 1 banks running critical financial workflows, and also by cutting-edge AI and infra startups pushing the boundary of what “production-grade agents” means. You’ll work on problems where reliability, correctness, and operational simplicity are existential.

Work with world-class engineers

You’ll partner directly with engineers who’ve built and operated foundational systems at scale - creators of Apache Flink, and leaders from Meta’s messaging infrastructure. You’ll have the chance to work with incredibly talented individuals who care deeply about their craft.

What you’ll do

This is a Cloud Infrastructure Engineering role spanning Restate’s product offering: OSS, on-prem deployments, multi-tenant SaaS, and BYOC. The scope of the role includes but is not limited to:

  • Build and operate Restate Cloud: extend our managed multi-tenant offering, working across the infrastructure, control plane, networking, storage, and observability of Restate workloads.

  • Evolve our BYOC product and work with customers on operating on-prem installations: design and build the infrastructure that runs inside customer cloud accounts.

  • Reliability and observability across the fleet: SLOs, metrics, traces, logs, alerting, and runbooks. Build automation so we can scale our product offering across deployment methods.

  • On-call: participate in the cloud on-call rotation. A US-based hire materially improves our timezone coverage.

What we’re looking for

Senior to Staff profile

We’re targeting Senior-to-Staff: you’ve operated production SaaS or platform infrastructure before, you’ve seen real failure modes, and you have (strong) opinions about how to run multi-tenant systems. You have an appreciation for operating in a compliance-sensitive environment.

Must-Haves:
  • Strong cloud infrastructure background with deep understanding of major cloud provider architectures.

  • Experience with infrastructure-as-code and cloud orchestration, particularly Kubernetes-based stateful workloads; balancing continuous delivery with safety while maintaining large-scale production systems.

  • Software engineering skills in a systems language (Rust, Go, C++); willingness and ability to learn Rust on the job.

  • You should be comfortable taking ownership end-to-end, from design through production operations, and thrive in early-stage startup ambiguity.

Nice-to-Haves:
  • Prior experience with Restate or durable execution specifically.

  • Deep enterprise procurement/compliance navigation.

  • Kubernetes operator development, experience with IaC systems like Cluster API, Crossplane or Terraform.

Not a fit:
  • You want to work primarily on the runtime core rather than cloud, BYOC, and customer-facing infra.

  • You’ve mostly architected and reviewed, and aren’t excited to be hands-on.

  • You are averse to multi-cloud, Kubernetes, or operating infrastructure as a shared responsibility with customers.

Our stack:
  • We use Restate extensively: the Restate Cloud control plane is built on Restate and TypeScript.

  • Rust infrastructure services and Kubernetes operators.

Location and travel
  • US-based, fully remote. East Coast is a plus as it would materially improve our on-call coverage given the team’s existing geography.

  • Travel: minimal - occasional team offsites, little required customer travel.
