eBay

AI Platform Systems Software Engineer

Reposted 7 Days Ago
In-Office
2 Locations
132K-222K Annually
Senior level
The role involves designing and optimizing AI/ML infrastructure, ensuring scalable solutions across cloud and on-prem environments, and collaborating with teams on performance and reliability of AI workloads.
The summary above was generated by AI

At eBay, we're more than a global ecommerce leader — we’re changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We’re committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.

Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work — every day. We're in this together, sustaining the future of our customers, our company, and our planet.

Join a team of passionate thinkers, innovators, and dreamers — and help us connect people and build communities to create economic opportunity for all.

About the team & role:

At eBay, we are building the next-generation AI platform to power experiences for millions of users worldwide. Our AI Platform (AIP) provides the scalable, secure, and efficient foundation for deploying and optimizing advanced machine learning and large language model (LLM) workloads at production scale. We enable teams across eBay to move from experimentation to global deployment with speed, reliability, and efficiency.

We are seeking an experienced AI Platform Systems Software Engineer (Infrastructure) to join our AI Platform team. In this role, you will design, implement, and optimize the core infrastructure that powers AI/ML workloads across eBay. You will work on highly distributed systems, cloud-native services, and performance-critical components that make large-scale inference and training possible.

You will be part of the team responsible for both the control plane (cluster management, scheduling, user access) and the data plane (execution, resource allocation, accelerator integration). Your work will directly impact the scalability, performance, and reliability of AI applications that serve eBay’s global marketplace.

What you will accomplish:

  • Design and scale services to orchestrate AI/ML clusters across cloud and on-prem environments, supporting VM and Kubernetes-based deployments, including Ray (ray.io) clusters for distributed training and online inference.

  • Develop and optimize intelligent scheduling and resource management systems for heterogeneous compute clusters (CPU, GPU, accelerators).

  • Integrate Ray Train/Tune for large-scale distributed training workflows and Ray Serve for low-latency, autoscaled inference; build platform hooks for observability, canary/A-B rollouts, and fault tolerance.

  • Build features to improve reliability, performance, observability, and cost-efficiency of AI workloads at scale.

  • Enhance the control plane to support secure multi-tenancy and enterprise-grade governance.

  • Implement systems for container management, dependency resolution, and large-scale model distribution.

  • Collaborate with ML researchers, applied scientists, and distributed systems engineers to drive platform innovation.

  • Provide production support and work closely with field teams to resolve infrastructure issues.

What you will bring:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field (or equivalent experience).

  • 8-10 years of experience building and maintaining infrastructure for highly available, scalable, and performant distributed systems.

  • Proven expertise with cloud-native technologies (AWS, GCP, Azure) and Kubernetes-based deployments.

  • Hands-on experience running ML training and inference with Ray (ray.io)—e.g., Ray Train/Tune for distributed training and Ray Serve for production inference—covering autoscaling, fault tolerance, observability and multi-tenant operations.

  • Deep understanding of networking, security, authentication, and identity management in distributed/cloud environments.

  • Hands-on experience with observability stacks (Prometheus, Grafana, OpenTelemetry, etc.).

  • Strong coding skills in Go and/or Python; familiarity with other systems-level languages is a plus.

  • Knowledge of Linux internals, containers, and storage systems.

  • Experience optimizing for GPU/accelerator integration (NVIDIA, AMD, TPU, etc.) is highly desirable.

The base pay for this position is expected to fall within the range below:

$132,000 - $222,100

Base pay offered may vary depending on multiple individualized factors, including location, skills, and experience. The total compensation package for this position may also include other elements, including a target bonus and restricted stock units (as applicable) in addition to a full range of medical, financial, and/or other benefits (including 401(k) eligibility and various paid time off benefits, such as PTO and parental leave). Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

If hired, employees will be in an “at-will position” and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.

Please see the Talent Privacy Notice for information regarding how eBay handles your personal data collected when you use the eBay Careers website or apply for a job with eBay.

eBay is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, veteran status, disability, or other legally protected status. If you have a need that requires accommodation, please contact us at [email protected]. We will make every effort to respond to your request for accommodation as soon as possible. View our accessibility statement to learn more about eBay's commitment to ensuring digital accessibility for people with disabilities. It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Top Skills

AWS
Azure
GCP
Go
Grafana
Kubernetes
OpenTelemetry
Prometheus
Python
Ray

eBay Austin, Texas, USA Office

7700 W Parmer Ln, Building D, Austin, Texas, United States, 78729
