Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

NVIDIA

US, CA, Santa Clara · workday · 5mo ago

Skills & Keywords

LLM, RAG, Generative AI

Job Description

NVIDIA Dynamo is a high-throughput, low-latency inference framework for serving generative AI and reasoning models across multi-node distributed environments. Built in Rust for performance and Python for extensibility, Dynamo orchestrates GPU shards, routes requests, and manages shared KV cache across heterogeneous clusters so that many accelerators feel like a single system at datacenter scale. As large language models rapidly outgrow the memory and compute budget of any single GPU, this platfo
