How Nvidia Scaled Its Cloud Services With KubeVirt
In 2013, Nvidia decided that users should have the ability to play top-of-the-line games on top-of-the-line hardware without having to shell out $3,000 for a gaming PC. The company built GeForce NOW, an online service that made super fast GPU-backed gaming PCs in the cloud accessible to players anywhere in the world.
GeForce NOW grew in popularity to the point where it currently has 25 million subscribers. With all those users, the service isn’t in danger of sunsetting anytime soon, but there was a moment of truth for Nvidia around its original architecture. While Nvidia favors next-generation IT in almost all cases, GeForce NOW was built with virtual machines (VMs), not Linux containers, and this was causing problems for the service’s plans to scale out.
Scaling out such a service is an ideal use case for containers orchestrated by Kubernetes. But what happens if the original gaming platform was built on VMs, which are more rigid and less amenable to rapid scaling out and back down?
Enter KubeVirt, an open source platform for running container and virtualization workloads on premises or in the cloud.
Nvidia built its business on high-end gaming hardware. So how did the company build an online gaming platform that runs on both containers and VMs?
First, some background.
What Is KubeVirt?
KubeVirt presents a unified shared platform where developers and administrators can build, modify and deploy applications in containers and VMs. KubeVirt allows VMs to be managed with the same software used to manage Kubernetes, whether you use Red Hat OpenShift or you roll your own (DIY).
KubeVirt places a virtual machine inside a Linux container. Thus you can manage VMs like other container-based assets on your platform. KubeVirt includes support for VM snapshots, live migration, memory hot plugging, non-uniform memory access (NUMA), huge pages, virtual networking and storage. You get access to all the VM features you expect when dealing with virtual machines at scale, but manageable through the same tools you use for Kubernetes.
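To make the "VM as a Kubernetes asset" idea concrete, here is a minimal sketch of what a VM definition can look like when it is expressed as an ordinary Kubernetes object. It builds a `VirtualMachine` manifest as a Python dictionary and prints it as JSON. The helper name `build_vm_manifest`, the VM name `demo-vm`, the memory size and the container disk image are illustrative assumptions, not anything Nvidia described. The field layout follows the `kubevirt.io/v1` schema as commonly documented, so check it against the KubeVirt version you run.

```python
import json


def build_vm_manifest(name: str, memory: str, image: str) -> dict:
    """Return a KubeVirt VirtualMachine manifest as a plain dictionary.

    All argument values are supplied by the caller; nothing here talks to a
    cluster. The structure mirrors a regular Kubernetes object: apiVersion,
    kind, metadata and spec.
    """
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # Assumption: keep the VM running whenever the object exists.
            "runStrategy": "Always",
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "domain": {
                        "devices": {
                            "disks": [
                                {"name": "rootdisk", "disk": {"bus": "virtio"}}
                            ]
                        },
                        "resources": {"requests": {"memory": memory}},
                    },
                    # The disk above is backed by an image pulled like a
                    # container image (a "containerDisk" volume).
                    "volumes": [
                        {"name": "rootdisk", "containerDisk": {"image": image}}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    manifest = build_vm_manifest(
        name="demo-vm",
        memory="2Gi",
        image="quay.io/kubevirt/cirros-container-disk-demo",
    )
    print(json.dumps(manifest, indent=2))
```

On a cluster with KubeVirt installed, JSON like this could be saved to a file and applied with `kubectl apply -f`, the same way you would apply a Deployment or a Service.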
How Nvidia Scaled KubeVirt
Nvidia had to satisfy thousands of gamers who expected a cloud gaming experience equivalent to playing on a PC at their own desk.
At KubeCon Paris 2024, Ryan Hallisey and Alay Patel from Nvidia presented some benchmarks for KubeVirt. The pair highlighted how the community drastically increased the performance of KubeVirt and showcased their benchmarking tools. The team wanted to move to a more microservices-based approach, said Hallisey. “How do we do this without completely abandoning our investment? This is where we looked at adopting KubeVirt. The next generation of GeForce NOW infrastructure is based on KubeVirt and Kubernetes.”
Managing and Automating VM Infrastructure on KubeVirt
KubeVirt is the hosting plane for VMs in a Kubernetes platform, and other tools provide automation and management. Ansible is an excellent automation tool for KubeVirt. GitOps practices, which keep the desired state of clusters in a Git repository, work well too.
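The core GitOps idea, comparing the desired state stored in Git with what is actually running, can be sketched in a few lines. The function below is a simplified illustration, not how any particular GitOps tool is implemented. `find_drift` and the sample dictionaries are hypothetical, and the sketch only checks fields that appear in the desired state, ignoring anything extra the cluster adds, such as status.

```python
from __future__ import annotations

from typing import Any


def find_drift(desired: Any, live: Any, path: str = "") -> list[str]:
    """Return dotted paths where the live object differs from the desired one.

    Only keys present in `desired` are compared, so server-populated fields
    in `live` (for example, status) do not count as drift.
    """
    if isinstance(desired, dict):
        if not isinstance(live, dict):
            return [path or "<root>"]
        drift: list[str] = []
        for key, want in desired.items():
            child = f"{path}.{key}" if path else key
            if key not in live:
                drift.append(child)  # field is missing from the live object
            else:
                drift.extend(find_drift(want, live[key], child))
        return drift
    # Leaf values (strings, numbers, lists) are compared directly.
    return [] if desired == live else [path or "<root>"]


if __name__ == "__main__":
    # Hypothetical desired state, as it might be stored in a Git repository.
    desired = {
        "spec": {
            "runStrategy": "Always",
            "template": {"spec": {"domain": {"resources": {"requests": {"memory": "2Gi"}}}}},
        }
    }
    # Hypothetical live state, as it might be read back from the cluster.
    live = {
        "spec": {
            "runStrategy": "Always",
            "template": {"spec": {"domain": {"resources": {"requests": {"memory": "1Gi"}}}}},
        },
        "status": {"ready": True},
    }
    for field in find_drift(desired, live):
        print(f"drift detected at: {field}")
```

A real GitOps controller would go further and reapply the desired manifest once drift is found. The point here is only that VM definitions in KubeVirt are declarative objects, so this kind of comparison works for them as it does for containers.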
KubeVirt is a viable option for taking on workloads that currently run on other VM platforms. You can migrate them to KubeVirt using the open source Konveyor project Forklift, as this video demonstrates.
Once you’re up and running, you’ll find plenty of activity within the platform’s open source community. At DevConf in June 2024, Red Hat’s Lee Yarwood explored the state of VM creation in KubeVirt. And at Cloud Native Rejekts 2023, Cloudera’s Shane Kumpf showed how the company moved toward a hyperconverged infrastructure using KubeVirt. Join the community and interact with other users.
Conclusion
KubeVirt scales to thousands of users, providing a side-by-side platform for VMs and containers managed by one set of tools. Combining the control plane and management of containers and VMs reduces the load on developers and systems administrators.
Migrating your workloads from another virtualization platform to KubeVirt can start with Konveyor and then be automated and managed by other tools. KubeVirt is a viable alternative virtualization platform offering standard VM features, an ecosystem of partners and scale-out performance.