Graphcore

Server CPU Hardware Systems Lead - US

Posted An Hour Ago
Hybrid
Austin, TX, USA
Expert/Leader
Lead systems engineer for blade and rack validation of ARM/x86 server compute racks. Drive first-silicon bring-up, firmware integration, lab and debug tool development, test plan ownership, triage and resolution of HW/FW/SW issues, and cross-functional coordination to meet program milestones and improve validation capabilities.
The summary above was generated by AI

We are looking for a disciplined and dynamic Lead System Engineer, Compute Blade and Rack Validation, to join our growing compute rack validation team. As a diligent leader in systems engineering, you will drive multiple aspects of validation throughout the life cycle of the program. In this high-visibility position, you will be part of a leading team that innovates and improves system bring-up and enablement capabilities, as well as silicon and system validation, to deliver the highest-quality, industry-leading technologies to market. Your technical leadership and your validation and debug expertise will be essential to product development, definition, root-cause analysis and resolution. Your agility and collaborative approach will be essential to working with System Validation and other engineering teams (System Architects, SoC and Rack FW, etc.).

The technical leader will drive key areas of system validation, including leading first-silicon and system bring-up (nodes and rack-level systems); rack-level systems and blades will be based on ARM server architecture. The candidate will be immersed in challenging system enablement work, end-to-end system validation methodology, test development and execution, and triage/debug of critical issues to meet critical program milestones at POR quality. The candidate will also be a key contributor to state-of-the-art HW and lab capabilities for Graphcore’s system engineering. The candidate should be able to work in a global environment while maintaining a synergetic culture.


Primary Responsibilities: 

  • Lead system enablement (including first silicon and other FW components) to ensure system capabilities are brought up as per the plan of record and the system architecture spec.
  • Drive organization-wide methodology for firmware integration and the best-known-configuration (HW/FW/SW) usage model by leading the release of deployment-ready solutions.
  • Develop key methodologies, lab HW and system SW capabilities, as well as the system visibility and debug tools necessary for successful system (HW/SW/FW) bring-up and system validation at blade and rack level for the AI compute rack.
  • Triage issues found during server rack validation bring-up, post-silicon validation, and production phases of the program. Ensure issues are solved on time and with quality.
  • Develop and own test plans, and lead test case development and execution for key domains within AI compute solutions, such as CPU, GPU, memory, HBM and IO.
  • Drive technical innovation to improve capabilities across system validation, including tool and script development, technical and procedural methodology enhancement, and various internal and cross-functional technical initiatives.

Qualifications: 

  • Strong analytical/problem-solving skills and pronounced attention to detail.
  • Extensive experience in validation roles involving first silicon and system bring-up, OS, FW, silicon, and HW.
  • Understanding of PC industry standard buses and their software stacks, such as PCIe and CXL.
  • Proven experience in understanding, defining and enabling storage (storage rack) and networking capabilities (network rack, DNS, DHCP, etc.) in a lab environment to help add end-to-end validation and debug capabilities for rack and blade validation.
  • Strong knowledge of ARM CPU or x86 architecture, SoC design, memory, RAS and power management, as well as HW/SW-based tools and revision control systems.
  • Extensive knowledge of system architecture, technical debug, and validation strategy.
  • Good understanding of and experience in platform/system-level debug and in operating system, device driver and system BIOS interactions.
  • Excellent communication and coordination skills. 
  • Detail-oriented, highly organized, and able to prioritize and juggle multiple workstreams to tight deadlines.
  • Technical leadership: capable of championing new tools, methods, and capabilities to drive platform validation improvements in schedule, quality, or coverage.
  • Deep experience with Linux and Windows Operating Systems, Hypervisors (VMware, KVM, Hyper-V, etc.), and development and certification processes of these environments
  • Experience in technical program management.
  • A thorough understanding of datacenter industry technologies and their software stack.
  • Must be a self-starter, and able to independently drive tasks to completion

Preferred Qualifications: 

  • Master's or PhD in Electrical Engineering, Computer Engineering or a related field.
  • 14+ years of work experience on complex systems engineering challenges, validating and debugging HW/FW/SW issues in a server compute rack or data center blade environment.
  • Experience designing and deploying modern AI/ML rack scale systems 
  • Knowledge of industry standards and best practices for hardware development.
  • Familiarity with emerging technologies in AI and data centers.
  • Comfortable meeting, engaging and collaborating with ODM partners across the globe.

USA Benefits
In addition to a competitive salary, Graphcore offers flexible working and a comprehensive benefits package designed to support your health, wellbeing and financial future. Our benefits include medical, dental and vision coverage, Flexible Spending Accounts (FSAs), Health Savings Accounts (HSAs), disability and life insurance, a 401(k) retirement plan, commuter benefits, wellness services and an Employee Assistance Programme (EAP). We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.


Graphcore Austin, Texas, USA Office

Austin, TX, United States

Similar Jobs at Graphcore

An Hour Ago
Hybrid
Austin, TX, USA
Expert/Leader
Artificial Intelligence • Semiconductor
Lead server and blade rack bring-up, install and configure servers, manage inventory via DCIM, run post-silicon validation and debug for CPU/GPU/HBM/IO, develop lab validation tools and scripts, coordinate data center projects and vendors, and drive technical improvements in system validation.
Top Skills: Asset Management, Cisco UCS, Copper Cabling, DCIM, Dell MX, DHCP, DNS, Fiber Optic Cabling, Firmware Flashing, GPU, HBM, HPE Synergy, Linux, Python, Storage Systems, Ticketing/Change Management Systems, Ubuntu
Yesterday
Hybrid
Austin, TX, USA
Mid level
Artificial Intelligence • Semiconductor
The Electrical Engineer will design hardware systems for AI applications, develop schematics, create PCBs, and collaborate on designs through production.
Top Skills: Analog and Digital Circuit Design, Lab Equipment, PCB Design, Power Integrity Analysis Tools, Schematic Capture, Server Hardware, Signal Integrity Analysis Tools
2 Days Ago
Hybrid
Austin, TX, USA
Mid level
Artificial Intelligence • Semiconductor
The Staff AI Performance Engineer will optimize performance across ARM-based architectures and distributed systems, analyzing AI workloads and collaborating to enhance system efficiency.
Top Skills: ARM, C++, MLPerf, MPI, NCCL, Python, PyTorch, RDMA, TensorFlow, UCX

What you need to know about the Austin Tech Scene

Austin has a diverse and thriving tech ecosystem thanks to home-grown companies like Dell and major campuses for IBM, AMD and Apple. The state’s flagship university, the University of Texas at Austin, is known for its engineering school, and the city is known for its annual South by Southwest tech and media conference. Austin’s tech scene spans many verticals, but it’s particularly known for hardware, including semiconductors, as well as AI, biotechnology and cloud computing. Its food and music scene, low taxes and favorable climate have also made the city a destination for tech workers from across the country.

Key Facts About Austin Tech

  • Number of Tech Workers: 180,500; 13.7% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Dell, IBM, AMD, Apple, Alphabet
  • Key Industries: Artificial intelligence, hardware, cloud computing, software, healthtech
  • Funding Landscape: $4.5 billion in VC funding in 2024 (Pitchbook)
  • Notable Investors: Live Oak Ventures, Austin Ventures, Hinge Capital, Gigafund, KdT Ventures, Next Coast Ventures, Silverton Partners
  • Research Centers and Universities: University of Texas, Southwestern University, Texas State University, Center for Complex Quantum Systems, Oden Institute for Computational Engineering and Sciences, Texas Advanced Computing Center
