dv01

MLOps Engineer

Posted Yesterday
Remote
Hiring Remotely in USA
185K-200K Annually
Senior level
Design, build, and operate an ML lifecycle platform to enable reproducible training, deployment, and monitoring of models. Implement CI/CD, containerized deployments on Kubernetes, infra-as-code, model observability, and governance. Mentor teams, define shared patterns, and partner with security and compliance to keep ML systems secure and production-ready.
The summary above was generated by AI

dv01 is lifting the curtain on the largest financial market in the world: structured finance. The $16+ trillion market is the backbone of everyday activities that empower financial freedom, from consolidating credit card debt and refinancing student loans, to buying a home and starting a small business.

dv01’s data analytics platform brings unparalleled transparency into investment performance and risk for lenders and Wall Street investors in structured products. As a data-first company, we wrangle critical loan data and build modern analytical tools that enable strategic decision-making for responsible lending. In a nutshell, we're helping prevent a repeat of the 2008 global financial crisis by offering the data and tools required to make smarter, data-driven decisions, resulting in a safer world for all of us.

More than 400 of the largest financial institutions use dv01 for our coverage of over 100 million loans spanning mortgages, personal loans, auto, buy-now-pay-later programs, small business, and student loans. dv01 continues to expand coverage of new markets, adding loans monthly, and developing new technologies for the structured products universe.

The Role

We're looking for an MLOps Engineer to build and operate the platform that gets our machine learning and AI work into production reliably. You'll own the lifecycle tooling and infrastructure that lets data science and engineering teams train, track, deploy, and monitor models without reinventing the wheel each time. This is a hands-on, senior-individual-contributor role: you'll set technical direction in your area and mentor less-experienced engineers, while spending most of your time building.

You Will

Build and operate the ML lifecycle platform. Own the tooling that makes model development reproducible and production-ready, with MLflow (or comparable systems) at the center: experiment tracking, model registry, artifact and metadata management, and versioned, repeatable training and inference pipelines.

Own CI/CD and deployment for ML workloads. Build automated pipelines that move models from notebook to production safely, including packaging, containerization, automated testing and validation, staged rollouts, and rollback.

Make models observable and reliable in production. Stand up monitoring for model and service health, including latency, drift, data-quality, and cost signals, with alerting and clear runbooks so issues surface and resolve quickly.

Build the cloud-native foundations. Contribute to and manage containerized workloads on Kubernetes and codify infrastructure with infrastructure-as-code tooling such as Terraform, keeping environments consistent, secure, and reproducible.

Establish sensible guardrails. Implement infrastructure-level governance for ML systems, including access controls, deployment policies, and auditability, partnering with security and compliance to align with our risk and regulatory requirements.

Enable and mentor the teams you support. Define repeatable patterns and shared services that reduce friction for data and application teams, provide technical guidance and mentorship to junior engineers, and contribute to the direction of dv01's MLOps practices.

You Have

4–7 years of relevant experience in platform engineering, DevOps, or MLOps, with solid experience operating systems in production.

Hands-on experience with ML lifecycle tooling. You've built or operated experiment tracking, model registry, and pipeline workflows using MLflow or similar platforms (e.g., Weights & Biases, Kubeflow, SageMaker, Vertex AI Pipelines). This is core to the role.

Strength in cloud-native infrastructure. You're comfortable with Kubernetes, containerized workloads, and infrastructure-as-code tools such as Terraform.

CI/CD fluency. You've designed and maintained automated build, test, and deployment pipelines, ideally for ML or data workloads.

Solid Python/Go skills and comfort supporting PyTorch-based production systems (deploying, serving, and operating them, not necessarily authoring the models).

An operations and security mindset. You understand infrastructure security, IAM, secrets management, and operational risk, and you build with secure, reliable defaults.

Clear communication and collaboration. You work well cross-functionally, can mentor and provide technical guidance, and are comfortable making pragmatic decisions in ambiguous problem spaces.

Nice to Have
  • Experience with GCP
  • Experience with Pulumi
  • Experience with GitHub Actions (GHA)
  • Experience with Go
  • Experience supporting data engineering platforms, data warehousing, or ETL/ELT operations
  • Exposure to LLM serving runtimes (e.g., vLLM, llama.cpp) or agentic systems and Model Context Protocol (MCP) servers
  • Familiarity with ML compiler stacks (e.g., LLVM/MLIR)
  • Experience designing benchmarking or evaluation frameworks for ML/AI systems
  • Familiarity with Excel Pivot Tables  

In good faith, our salary range for this role is $185,000–$200,000, but we are not tied to it. Final offer amount will be at the company’s sole discretion and determined by multiple factors, including years and depth of experience, expertise, and other business considerations.

Our community is fueled by diverse people who welcome differing points of view and the opportunity to learn from each other. Our team is passionate about building a product people love and a culture where everyone can innovate and thrive.

BENEFITS & PERKS:

  • Unlimited PTO. Unplug and rejuvenate, however you want—whether that’s vacationing on the beach or at home on a mental-health day.
  • $1,000 Learning & Development Fund. No matter where you are in your career, always invest in your future. We encourage you to attend conferences, take classes, and lead workshops. We also host hackathons, brunch & learns, and other employee-led learning opportunities.
  • Remote-First Environment. People thrive in a flexible and supportive environment that best invigorates them. You can work from your home, cafe, or hotel. You decide.
  • Health Care and Financial Planning. We offer a comprehensive medical, dental, and vision insurance package for you and your family. We also offer a 401(k) for you to contribute to.
  • Stay active your way! Get $138/month to put toward your favorite gym or fitness membership — wherever you like to work out. Prefer to exercise at home? You can also use up to $1,650 per year through our Fitness Fund to purchase workout equipment, gear, or other wellness essentials.
  • New Family Bonding. Primary caregivers can take 16 weeks of 100% paid leave, while secondary caregivers can take 4 weeks. Returning to work after bringing home a new child isn’t easy, which is why we’re flexible and empathetic to the needs of new parents.

dv01 is an equal opportunity employer and all qualified applicants and employees will receive consideration for employment opportunities without regard to race, color, religion, creed, sex, sexual orientation, gender identity or expression, age, national origin or ancestry, citizenship, veteran status, membership in the uniformed services, disability, genetic information or any other basis protected by applicable law.
