This role is for one of our clients
Compensation: $70-$110 per hour
Join a leading AI lab's cutting-edge Generative AI team and play a key role in developing next-generation large language models. We are seeking experienced MLOps and ML Systems Engineers with deep expertise in PyTorch and kernel-level programming frameworks such as Triton or Pallas.
In this role, you will contribute to AI model training and evaluation initiatives by designing, solving, and reviewing advanced machine learning infrastructure and systems challenges. Your expertise will help improve the quality of training data used to develop frontier AI systems.
This is a full-time (40 hours/week) engagement supporting high-impact AI research and engineering efforts.
Key Responsibilities
- Partner with research and engineering teams to identify and address knowledge gaps in MLOps, machine learning infrastructure, and model training systems.
- Design challenging, real-world tasks focused on distributed training, ML frameworks, model optimization, and infrastructure engineering.
- Develop accurate, well-structured solutions to complex MLOps and ML systems problems.
- Evaluate technical tasks and solutions, providing detailed and actionable feedback.
- Create evaluation frameworks and scoring rubrics for training pipeline architecture, distributed systems reasoning, performance optimization, and kernel-level programming.
- Contribute domain expertise to improve AI model capabilities in machine learning engineering topics.
- Collaborate with other subject matter experts to ensure consistency, quality, and technical accuracy across datasets and evaluations.
Qualifications
- 2+ years of professional experience in ML Infrastructure, MLOps, ML Systems Engineering, or a closely related field.
- Strong hands-on experience building and operating production-scale machine learning systems.
- Advanced proficiency with PyTorch, including model training, optimization, and deployment workflows.
- Experience developing, tuning, or optimizing custom GPU kernels using Triton, Pallas, or similar frameworks.
- Demonstrated career growth and increasing technical responsibility.
- Ability to commit to a full-time, 40-hour-per-week schedule during standard business days.
- Excellent written communication skills and the ability to clearly explain complex technical concepts and engineering decisions.
- Experience with large-scale distributed training frameworks and infrastructure.
- Knowledge of GPU performance optimization and compiler-level ML tooling.
- Familiarity with modern AI training pipelines, model evaluation methodologies, and LLM development workflows.
- Experience mentoring engineers or contributing to technical standards and best practices.
- Background in cloud-native ML infrastructure and production deployment environments.
- Work alongside leading AI researchers and engineers on frontier AI systems.
- Influence the development and evaluation of next-generation large language models.
- Apply your expertise to solve challenging machine learning infrastructure and optimization problems.
- Contribute to high-impact projects at the forefront of AI innovation.
- Full-time engagement requiring 40 hours per week.
- Dedicated commitment is expected during the engagement period.
- Responsibilities and project scope may evolve based on research priorities and business needs.
All qualified applicants will be considered without regard to legally protected characteristics. Reasonable accommodations are available upon request.