As the AI landscape continues to evolve, secure and confidential computing has become a top priority, reflecting a broader industry shift toward stronger data protection in the cloud. At the OpenInfra Summit Europe 2025, NVIDIA emphasized the importance of combining Kata Containers with Confidential Computing to preserve bare-metal GPU performance while preventing cloud operators from inspecting sensitive models and data.
Kata Containers, an open-source project, provides lightweight VMs for containers, using hardware virtualization to launch a separate VM for each container. This approach offers the performance benefits of containers along with the security and workload isolation of VMs. Confidential Computing, for its part, encrypts data and application state in memory, so that even the cloud provider cannot access sensitive information.
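In Kubernetes, the per-container VM described above is typically selected through a RuntimeClass. The sketch below builds such a pod manifest as a plain Python dict; the runtime class name `kata`, the pod name, and the image are illustrative assumptions, since the actual handler name depends on how the cluster's Kata runtime is registered.

```python
# Sketch of a pod manifest that asks Kubernetes to run the container
# inside a Kata VM via a RuntimeClass. The class name "kata" and the
# image are placeholder assumptions, not values from the article.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gpu-inference"},
    "spec": {
        "runtimeClassName": "kata",  # hardware-virtualized sandbox per pod
        "containers": [
            {
                "name": "inference",
                "image": "example.com/model-server:latest",  # hypothetical image
            }
        ],
    },
}
```

Pods without the `runtimeClassName` field fall back to the cluster's default (usually a shared-kernel runtime such as runc), which is what makes the Kata sandboxing an opt-in, per-pod decision.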
The combination of Kata Containers and Confidential Computing is not a silver bullet, but it substantially reduces the opportunity for cloud operators or co-tenants to access sensitive model artifacts or training data. As Zvonko Kaiser, NVIDIA principal systems engineer, explained, “We do not trust the infrastructure.” Under this model the workload is trusted but the infrastructure is not, so the VM's memory is encrypted and even the cloud provider cannot snapshot or inspect the guest.
NVIDIA is working to make GPU workloads “lift-and-shift” into Kata/confidential VMs without losing performance or functionality. This effort includes support for PCIe pass-through, Single Root I/O Virtualization (SR-IOV), GPUDirect Remote Direct Memory Access (RDMA), and per-pod runtime configurations. The company’s Virtualization Reference Architecture (VRA) addresses the thorny problem of PCIe topology and peer-to-peer GPU communication inside VMs, supporting two approaches: flattening the hierarchy and host-topology replication.
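The two topology approaches can be illustrated with a toy model (a sketch under assumed data shapes, not NVIDIA's actual VRA tooling): host-topology replication mirrors the host's PCIe tree into the guest so peer-to-peer GPU paths look the same, while flattening discards intermediate switches and attaches every passed-through device at the same level.

```python
# Toy model of mapping a host PCIe topology into a guest VM.
# The nested-dict tree representation is an illustrative assumption,
# not the format used by NVIDIA's VRA or the Kata runtime.

def _leaves(node):
    """Collect leaf devices from a nested dict/list PCIe tree."""
    if isinstance(node, dict):
        return [dev for child in node.values() for dev in _leaves(child)]
    if isinstance(node, list):
        return [dev for child in node for dev in _leaves(child)]
    return [node]

def flatten_hierarchy(host_tree):
    """Attach every device directly under the guest root, dropping switches."""
    return {"root": _leaves(host_tree["root"])}

def replicate_host_topology(host_tree):
    """Mirror the host hierarchy so guest peer-to-peer paths match the host."""
    return host_tree

# Hypothetical host: two PCIe switches, two GPUs behind each.
host = {"root": {"switch0": ["gpu0", "gpu1"], "switch1": ["gpu2", "gpu3"]}}
flat = flatten_hierarchy(host)            # all four GPUs under one root
mirrored = replicate_host_topology(host)  # guest sees the host's switches
```

The trade-off the VRA navigates is that flattening is simpler to virtualize, while replication preserves the locality information that peer-to-peer GPU communication and GPUDirect RDMA depend on.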
The importance of attestation cannot be overstated, as it provides a cryptographic proof that the VM and its boot/guest state match an expected configuration. This enables a full-stack trust model across the control plane, worker nodes, and pods. NVIDIA is collaborating with Red Hat, IBM, and the open-source Kata community to upstream the VRA and tooling, including host-topology detection and performance guides.
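The idea behind attestation can be sketched as a hash chain checked against an expected reference value. This is a simplified illustration with placeholder component names; real confidential VMs rely on hardware-rooted, cryptographically signed measurements from the CPU (e.g. AMD SEV-SNP or Intel TDX), not a bare hash comparison.

```python
import hashlib
import hmac

def measure(components):
    """Fold boot components into one measurement, mimicking the extend
    operation of a measured-boot register: each step hashes the prior
    state together with the next component's digest."""
    state = b"\x00" * 32
    for component in components:
        state = hashlib.sha256(state + hashlib.sha256(component).digest()).digest()
    return state

def attest(measurement, reference):
    """Accept the VM only if its measurement matches the expected value,
    using a constant-time comparison."""
    return hmac.compare_digest(measurement, reference)

# The component names below are illustrative placeholders.
expected = measure([b"firmware", b"kernel", b"guest-image"])
```

Because each step folds in the previous state, changing any one component (or reordering them) changes the final measurement, which is what lets a verifier detect a tampered boot chain from a single value.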
NVIDIA’s approach to running sensitive AI workloads at scale has significant implications. By combining Kata Containers, Confidential Computing, and GPU device-mapping abstractions, the company is paving the way for an AI stack that delivers security without sacrificing performance. As the industry continues its shift toward confidential computing, this work is likely to shape how AI workloads run in the cloud.
Source: https://thenewstack.io/how-to-get-bare-metal-gpu-performance-in-confidential-vms