EXL

Senior Data Engineer

Posted 5 Days Ago
Remote or Hybrid
Hiring Remotely in United States
94K-154K Annually
Senior level
Design, build, and operate production-grade, event-driven data pipelines (Kafka/Flink) on GCP to deliver model-ready features. Optimize BigQuery SQL and Parquet performance, develop Python data workloads (Polars/Pandas), deploy ML pipeline components on Kubeflow/Vertex AI with Docker, design event store architectures, and collaborate with ML and platform teams while documenting architecture and standards.
The summary above was generated by AI

EXL is hiring a Senior Data Engineer to join a strategic AI / ML platform engagement with a leading specialty retailer. This is a hands-on build role embedded with the client's platform engineering team.

The role requires shipping production-grade data pipelines that feed real-time customer event data into machine learning workflows. The right person is comfortable owning the full lifecycle of pipeline design, build, and deployment: from streaming ingestion through event store design to model-ready feature delivery.

This is a high-visibility role with growth potential into a larger book of work as the engagement expands.

Salary Range: $93,900 - $154,200 annual base 

The posted range is the hiring range for this role — a subset of the broader range available to employees over time — and reflects base salary across our national hiring scale. Final offers are based on several factors, including the candidate's skills and experience, internal pay equity, work location, market conditions for the role, and the specific scope and responsibilities of the position. The top of the range is reserved for candidates who notably exceed the requirements; the lower end applies to those with less experience or fewer preferred qualifications. For positions based in higher-cost zones (e.g., California, New York, New Jersey), actual compensation may exceed the posted range; your recruiter will share specifics during the process.

Responsibilities
What You'll Do
  • Design and operate event-driven data pipelines using Kafka consumers and Flink jobs to process high-volume customer events (clicks, purchases, returns) in near-real time.
  • Build and optimize large-scale data transformations on Google Cloud Platform — BigQuery SQL, query performance tuning, and partitioning strategy at scale.
  • Develop Python data engineering workloads using Polars or Pandas at scale, with rigorous attention to Parquet partitioning, join performance on large datasets, and memory efficiency.
  • Build, deploy, and maintain ML pipeline components on Kubeflow Pipelines (KFP) and Vertex AI; package and deploy services with Docker.
  • Design event store architecture: partitioning by customer, time-ordered event assembly across heterogeneous sources, and schema management for mixed event types.
  • Partner with ML engineers, platform engineers, and data scientists to deliver clean, performant, model-ready data products.
  • Document architecture decisions and contribute to engineering standards across the platform team.
Qualifications
Required Skills & Experience
  • 6–12 years of experience in data engineering, platform engineering, or a closely related discipline.
  • Streaming: Production experience with Kafka consumers and Flink stream processing — building, deploying, and operating streaming jobs at meaningful scale.
  • GCP Data Stack: Strong SQL on BigQuery (or an equivalent cloud warehouse), with demonstrated query optimization, cost management, and partitioning chops.
  • Python Data Engineering: Hands-on with Polars or Pandas at scale; deep working knowledge of Parquet partitioning and performance on large joins.
  • ML Pipelines: Hands-on experience building and deploying components on Kubeflow Pipelines (KFP) and/or Vertex AI Pipelines; working proficiency with Docker.
  • Event Store Design: Demonstrated experience designing event stores — partitioning by customer, time-ordered event assembly across sources, schema strategy for mixed event types (clicks, purchases, returns).
  • Communication: Strong written and verbal communication; comfortable being the senior IC voice in design conversations with client stakeholders.
Nice to Have
  • Domain experience in Retail or E-commerce — customer journey data, transaction analytics, returns and exchanges modeling.
  • Exposure to schema registry tooling (e.g., Confluent), Iceberg, or Delta Lake.
  • Experience working in client-facing or consulting engagements.
  • Google Cloud certifications (Professional Data Engineer or equivalent).
Work Arrangement & Eligibility
  • This role requires 3–4 days per week onsite in Seattle, WA. Fully remote and out-of-state candidates will not be considered.
  • EXL is open to sponsoring H-1B transfers for qualified candidates.


