Contract GPU kernel optimization role: analyze profiler metrics, identify bottlenecks, and improve kernel performance across modern GPU hardware. Implement and modify C++17, Python, and GPU code using CUDA/HIP/shaders, document optimization decisions, and collaborate as a freelance specialist (20+ hrs/week preferred).
This role is for one of our clients
Compensation: $80-$100 per hour
We are seeking GPU kernel optimization experts to contribute to a project with a leading AI lab. This opportunity is designed for freelancers with strong C++ skills, practical GPU programming experience, and the ability to improve kernel performance using profiler-guided analysis. You’ll help evaluate, optimize, and reason about GPU kernels across modern hardware environments. This is a contract-based opportunity for specialists who enjoy squeezing performance out of modern GPU architectures.
Requirements
Key Responsibilities
- Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization
- Use profiler metrics such as L2 cache hit rate, L2 throughput, occupancy, and related signals to guide kernel improvements
- Review GPU kernel implementations and identify bottlenecks without requiring extensive background in the underlying algorithms
- Write, modify, and reason about C++17, Python, and GPU programming code
- Apply CUDA, HIP, shader programming, or related kernel programming expertise to improve performance outcomes
- Document optimization decisions clearly, including when specific profiler metrics are or are not useful
- Available to work at least 20 hrs/wk
- Fluent in core C++ features through C++17
- Working knowledge of Python and Git
- Fluent in at least one GPU programming model, such as CUDA, HIP, Slang, HLSL, GLSL, or related kernel programming
- At least 1 year of professional or graduate-level research experience working with GPUs
- Strong understanding of GPU profiler performance metrics and how to use them to optimize kernels
- Ability to optimize GPU kernels without needing deep prior context on every algorithm
- Experience with CUDA, HIP, CUDA C++ Core Libraries, inline PTX assembly, or tensor core-level optimization is a plus
- Experience optimizing kernels for NVIDIA Blackwell hardware is a plus
- Familiarity with Nsight Compute is a plus
- Prior experience with GPU hardware organizations such as NVIDIA, AMD, or Qualcomm is a plus
- Open-source contributions related to GPU kernel optimization are a plus
- Submit your resume or relevant technical background to get started
- Qualified applicants may be asked to complete a brief technical assessment or submit additional information
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Contract and Payment Terms
- You will be engaged as an independent contractor.
- This is a fully remote role that can be completed on your own schedule.
- Projects can be extended, shortened, or concluded early depending on needs and performance.
- Your work will not involve access to confidential or proprietary information from any employer, client, or institution.
- Payments are made weekly via Stripe or Wise, based on services rendered.
- Please note: We are unable to support H-1B or STEM OPT candidates at this time.

