Supplier.io

Senior Data Scientist

Reposted 2 Hours Ago
Remote
Hiring Remotely in United States
Senior level
The Senior Data Scientist will design ML-based entity resolution systems, build and refine NLP and ML models, and translate model results into business impact while supporting data strategy and improving data pipelines.

Supplier.io is the market leader in supplier intelligence, trusted by over half of the Fortune 100 to power smarter, more responsible sourcing decisions. Our platform helps corporate procurement teams discover, evaluate, and engage with over 11 million suppliers with a focus on local, small, diverse, and sustainable businesses. This helps organizations build supply chains that are resilient, inclusive, and built for impact.


Our solutions empower today’s procurement teams with accurate data, actionable insights, and measurable impact, helping them mitigate risk, expand sourcing options, achieve ESG goals, and advance economic inclusion. Whether tracking spend, sourcing alternate suppliers, or measuring program results, Supplier.io transforms complexity into clarity, empowering teams to lead with confidence and build supply chains that deliver for both business and community.


Join a company committed to innovation, inclusion, and making a difference one sourcing decision at a time. For more information, visit www.supplier.io.


The Opportunity

 

Supplier.io is expanding its data team and is seeking a Senior Data Scientist with a strong data science orientation to play a critical role in scaling and modernizing our supplier intelligence platform. This role is weighted approximately 80% toward data science and 20% toward data engineering, making it ideal for someone who has deep, hands-on experience building and training ML and NLP models and is equally comfortable operationalizing those models within production data pipelines. You will bring strong architectural thinking, thrive in complex environments, and enjoy mentoring others while collaborating across teams, geographies, and disciplines.

 

A central focus of this role is Entity Resolution, which is the process of identifying, linking, and merging records across disparate data sources that refer to the same real-world entity (suppliers in our case). This involves resolving inconsistencies, handling missing data, and eliminating duplicates to create a single, accurate, and trustworthy supplier profile, often referred to as a “golden record” or 360-degree view. Our current systems leverage Lucene-based search and XGBoost ML models, and we are exploring the use of LLMs to further enhance these capabilities. The ideal candidate will improve and reimagine our existing legacy entity resolution systems, bringing experience with ML-based approaches to matching and deduplication at scale.
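For readers unfamiliar with the term, the match, link, and merge steps described above can be illustrated with a small Python sketch. This is not Supplier.io's implementation: the posting says the production system uses Lucene-based search and XGBoost models, while this toy example substitutes standard-library string similarity and a same-city rule. The sample records, field names, legal-suffix list, and the 0.9 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical supplier records from two sources; field names are illustrative.
RECORDS = [
    {"id": "a1", "name": "Acme Industrial Supply, Inc.", "city": "Austin", "phone": ""},
    {"id": "b7", "name": "ACME Industrial Supply", "city": "Austin", "phone": "512-555-0100"},
    {"id": "a2", "name": "Lone Star Fasteners LLC", "city": "Dallas", "phone": ""},
]

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}


def normalize(name):
    """Lowercase, strip punctuation, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def is_match(a, b, threshold=0.9):
    """Rule-based stand-in for a trained matching model."""
    same_city = a["city"].lower() == b["city"].lower()
    score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return same_city and score >= threshold


def resolve(records):
    """Link matching records into clusters using union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Compare every pair; real systems narrow candidates first (blocking/search).
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if is_match(records[i], records[j]):
                parent[find(j)] = find(i)

    clusters = {}
    for i, rec in enumerate(records):
        clusters.setdefault(find(i), []).append(rec)
    return list(clusters.values())


def golden_record(cluster):
    """Merge a cluster, taking the first non-empty value for each field."""
    merged = {"source_ids": [r["id"] for r in cluster]}
    for field in ("name", "city", "phone"):
        merged[field] = next((r[field] for r in cluster if r[field]), "")
    return merged


golden = [golden_record(c) for c in resolve(RECORDS)]
```

Here the two Acme records collapse into one profile that keeps the phone number found in only one source, while the Dallas supplier stays separate. The all-pairs comparison is quadratic, which is one reason systems at scale rely on a search index to shortlist candidates before scoring them.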

 

As a Senior Data Scientist, you will drive, shape, and execute our long-term data and data science strategy, design resilient and scalable data architectures, and champion technical excellence across our data ecosystem. You will work closely with the Product and Engineering teams to ensure our data systems support business growth, advance our matching capabilities, and enable data-driven decision-making.

 

To support Supplier.io's growth, we are investing heavily in cloud-native technologies. This role will be instrumental in leveraging modern data services and ML capabilities, optimizing cost, and ensuring our data platform is secure, reliable, and scalable.

 

What You Will Do

 

  • Design, build, and iterate on ML-based entity resolution systems that match, link, and deduplicate supplier records across disparate data sources to produce trusted golden records.
  • Build, train, and refine NLP and ML models (e.g., XGBoost, search ranking models) for supplier matching, classification, and data enrichment, with a focus on improving accuracy and recall.
  • Evaluate and integrate emerging approaches, including LLMs, into our entity resolution and data intelligence workflows.
  • Own the full ML model lifecycle: feature engineering, training, evaluation, monitoring, feedback loops, and iterative tuning in partnership with data engineering and product teams.
  • Translate model results into business impact and clearly communicate tradeoffs, performance metrics, and recommendations to non-technical stakeholders.
  • Build and maintain data products end-to-end, operationalize them within production data pipelines, and ensure they deliver reliable, scalable results.
  • Execute and influence a cohesive data strategy that aligns with company objectives and supports analytics, reporting, and downstream product use cases.
  • Own complex data modeling initiatives, including dimensional and analytical models that support business intelligence and advanced analytics.
  • Drive continuous improvement by optimizing data pipelines, query performance, reliability, observability, and cost efficiency.
  • Partner with Infrastructure, Product, and Engineering teams to ensure data systems meet best practices, security standards, and business needs.
  • Create and maintain comprehensive technical documentation, including architecture diagrams, data flow maps, runbooks, and operations procedures.
  • Troubleshoot and resolve complex, cross-system data issues and incidents.

 

What You Will Need to Succeed

 

  • Bachelor’s degree in Data Science, Computer Science, Machine Learning, Statistics, Engineering, or a related field.
  • 7+ years of progressive experience in data science and/or data engineering, with demonstrated ownership of ML-based systems in production environments. At least 2 years in a senior or lead capacity preferred.
  • Hands-on experience building NLP and LLM-based models in Python for real-world data science applications.
  • Strong understanding of ML model lifecycle considerations, including evaluation, monitoring, feedback loops, and iterative tuning in partnership with data engineering and product teams.
  • Strong ability to translate model results into business impact and communicate tradeoffs to non-technical stakeholders.
  • Direct experience building or significantly improving entity resolution or search ranking systems, including ML-based approaches to record matching, linking, and deduplication at scale.
  • Proficiency with ML frameworks and tools such as XGBoost, scikit-learn, PyTorch, or TensorFlow, and familiarity with search technologies such as Lucene/Elasticsearch.
  • Demonstrated ability to build and maintain data products end-to-end by operationalizing models within production data pipelines, not solely tuning them.
  • Advanced proficiency with Python and SQL for both data science and data engineering workflows.
  • Experience with Snowflake and cloud-native data platforms (Azure, AWS, GCP, or multi-cloud environments).
  • Familiarity with data modeling, ETL/ELT processes, and modern data warehousing principles.
  • Experience working in an agile development environment and collaborating through ticketing and collaboration tools such as Jira and GitHub.
  • Ability to communicate technical concepts clearly to technical and non-technical teams and influence decision-making.
  • Strong problem-solving skills with the ability to troubleshoot and resolve ambiguous, high-impact issues.
  • A results-oriented mindset with a demonstrated history of driving process improvements and technical excellence.
  • Ability to work independently while also serving as a trusted technical partner and mentor to others.
  • Ability to take vague requirements and turn them into technical roadmaps.

We do not accept unsolicited resumes from recruitment/search firms.

Supplier.io participates in E-Verify. For more information, click here. We will provide the Social Security Administration and, if necessary, the Department of Homeland Security, with information from each new employee’s Form I-9 to confirm work authorization. 

Supplier.io is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Supplier.io is unable to sponsor work visas (e.g., H-1B, TN, OPT, etc.) for US positions.

If you require reasonable accommodation to complete the application or interview process, please contact the Human Resources department at [email protected] or 978-843-5747. 


