Skills & Keywords

LLM, Generative AI

Job Description

We are now looking for a Deep Learning Architect, LLM Inference! NVIDIA is at the forefront of the generative AI revolution. The Inference Benchmarking (IB) team specifically focuses on inference server performance optimization for Large Language Models (LLMs). If you're passionate about pushing the boundaries of GPU hardware and software performance and understand terms like disaggregated serving, data parallel attention, MoE, Qwen3.5, DeepSeek, and GPT-OSS, then this is a great role for you!


Deep Learning Architect, LLM Inference - New College Grad 2026
