Two Minute Papers: NVIDIA’s New AI: Video Game Graphics, Now 60x Smaller!
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. When creating a video game, an animated movie, or any believable virtual world, among many other things, we need geometry. Tons and tons of geometry, both to store and to render images of our characters and the environment. And having lots of geometry means a big headache. You see, we can use traditional methods that store all the data needed for these objects, and they also render super fast. But there is a problem. What is the problem? Well, size. These can take gigabytes that will eat not only your storage, but your cellular data plan as well.
Now, do not despair. Modern learning-based methods are coming to the rescue in the form of NeRFs. That is, neural radiance fields. Here, we don’t need to store all the geometry, just a few photos, and miraculously, they can fill in all the missing data. So, can that really be? Can we fly through these photos? Yes, we can. This is amazing. Now, it gets better. These are Instant NeRFs. These can be trained in a matter of minutes, sometimes seconds, and rendered also in seconds. And look, we are light transport researchers over here, so our keen eyes see that the specular and refractive objects work really well too. That is fantastic. And folks at Luma AI have already made an app that can create these NeRFs right on your phone. And some of these seem either production quality or really close to it. Yes, that’s right. All this comes from a paper that was published just two years ago. This is incredible progress in just two years.
But wait. These NeRF representations are much smaller than traditional techniques. However, they still put some strain on your cellular data plan. And don’t forget, they require considerable horsepower to render quickly. But good news: neural networks are excellent at compression. For instance, NVIDIA already has a neural network that can help us compress even ourselves.
You see, in this earlier work, the technique takes only the first image from our video, and throws away the entire video afterwards. But before that, it stores a tiny bit of information from it, which is data on how our head is moving over time and how our expressions change. That is an absolutely outrageous idea, except for the fact that it works. With this, the video for a video call is at least as detailed as with traditional state-of-the-art techniques, but it uses a third as much data or even less.
And get this: in this new paper, scientists at NVIDIA claim to solve both of our NeRF problems at the same time. Compact and fast NeRFs, at the same time. Now, I will believe it when I see it. Can they put the incredible compression capabilities of neural networks to more use here? Let’s see. Whoa! Look at that! Here you see the previous NeRF-based method, and here, the new method. Can that really be? Am I seeing correctly, or am I dreaming? This new method requires 60 times less data to show us this geometry, and the quality, as described by the signal-to-noise ratio, is nearly the same. Not the same, but very close. And 60x cheaper, my goodness!
And it gets even better. That was just the compression part. What about the fast rendering part? Well, these models are so small that we can start downloading multiple versions of them at the same time, and the coarsest version can be shown to us after receiving only the first 10 kilobytes of data. That is equivalent to downloading less than one second of music. Wow! And as more data arrives, the finer details start to appear over time. And we are through the whole process very, very quickly. Incredible!
But wait, we are experienced Fellow Scholars over here, so we know that there are other papers on this topic. For instance, what about Plenoxels? How does it compare to that? This is Plenoxels, which is an amazing paper that we talked about earlier in this series.
By the way, this earlier method did not use neural networks to achieve what you see here, which is incredible. And now, look at that! This new one can go at least as far with 20 megabytes of data as Plenoxels can go with over 150. And Plenoxels is not just some ancient paper, it is from just one year ago. Seeing this kind of improvement in just one year, and just one more paper down the line, that is the power of human ingenuity. I love it! And you see, this might be the future of imagery. We don’t store the whole geometry of the objects anymore, we just take a few photos and let the AI fill in the rest. Bravo! What a time to be alive!
What you see here is a report of this exact paper we have talked about, which was made by Weights & Biases. I put a link to it in the description. Make sure to have a look, I think it helps you understand this paper better. Weights & Biases provides tools to track your experiments in your deep learning projects. Using their system, you can create beautiful reports, like this one, to explain your findings to your colleagues better. It is used by many prestigious labs, including OpenAI, Toyota Research, GitHub, and more. And the best part is that Weights & Biases is free for all individuals, academics, and open source projects. Make sure to visit them through wandb.com/papers, or just click the link in the video description, and you can get a free demo today. Our thanks to Weights & Biases for their long-standing support, and for helping us make better videos for you. Thanks for watching and for your generous support, and I’ll see you next time.