
Two Minute Papers: Google’s New AI: Fly INTO Photos!

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, we are able to take a bunch of photos and use an AI to magically create a video where we can fly through these photos. It is really crazy because this is possible today. For instance, here is NVIDIA’s method that can be trained to perform this in a matter of seconds. Now, I said that we can fly through these photos, but here is an insane idea. What if we used not multiple photos, but just one photo, and we don’t fly through it, but fly into this photo? Now you are probably asking, Károly, what are you talking about? This is completely insane, and it wouldn’t work with these NeRF-based solutions like the one you see here. These were not designed to do this at all. Look, oh yes, that.
So in order to fly into these photos, we would have to invent at least three things. One is image inpainting. If we are to fly into this photo, we will have to be able to look at regions between the trees. Unfortunately, these are not part of the original photo, and hence new content needs to be generated intelligently. That is a formidable task for an AI, and luckily, image inpainting techniques already exist out there. Here is one. But inpainting is not nearly enough.
Two, as we fly into a photo, completely new regions should also appear that are beyond the image. This means that we also need to perform image outpainting, creating these new regions, continuing the image, if you will. Luckily, we are entering the age of AI-driven image generation, and this is also possible today, for instance, with this incredible tool. But even that is not enough. Why is that?
Well, three, as we fly closer to these regions, we will be looking at fewer and fewer pixels, and from closer and closer, which means this. Oh my, another problem. Surely, we can’t solve this, right? Well, great news, we can. Here is Google’s diffusion-based solution to super-resolution, where the principle is simple.
Have a look at this technique from last year: in goes a coarse image or video, and this AI-based method is tasked with this. Yes, this is not science fiction, this is super-resolution, where the AI starts out from noise and synthesizes crisp details onto the image. So this might not be such an insane idea after all.
But does the fact that we can do all three of these separately mean that this task is easy? Well, let’s see how previous techniques were able to tackle this challenge. My guess is that this is still sinfully difficult to do. And oh boy, well, I see a lot of glitches and not a lot of new, meaningful content being synthesized here. And note that these are not some ancient techniques, these are all from just two years ago. It really seems that there is not a lot of hope here.
But don’t despair, and now hold on to your papers, and let’s see how Google’s new AI puts all of these together and lets us fly into this photo. Wow, this is so much better. I love it. Not perfect, but I feel that this is the first work where the flying-into-photos concept really comes to life. And it has a bunch of really cool features too. For instance, one, it can generate even longer videos, which means that after a few seconds everything that we see is synthesized by the AI. Two, it supports not only this boring linear camera motion, but these really cool, curvy camera trajectories too. Putting these two features together, we can get these cool animations that were not possible before this paper.
Now, the flaws are clearly visible for everyone, but this is a historic episode where we can invoke the Three Laws of Papers to address them. The First Law of Papers says that research is a process. So do not look at where we are, look at where we will be two more papers down the line. With this concept, we are roughly where DALL-E 1 was about a year ago. That is an image generator AI that could produce images of this quality.
And just one year later, DALL-E 2 arrived, which could do this. So just imagine what kind of videos this will be able to create just one more paper down the line.
The Second Law of Papers says that everything is connected. This AI technique is able to learn image inpainting, image outpainting, and super-resolution at the same time, and even combine them creatively. We don’t need three separate AIs to do this anymore, just one technique. That is very impressive.
And the Third Law of Papers says that a bad researcher fails 100% of the time, while a good one only fails 99% of the time. Hence, what you see here is always just 1% of the work that was done. Why is that? Well, for instance, this is a neural network-based solution, which means that we need a ton of training data for these AIs to learn on. And hence, scientists at Google also needed to create a technique to gather a ton of drone videos on the internet and create a clean dataset with labeling as well. The labels are essentially depth information, which shows how far different parts of the image are from the camera. And they did it for more than 10 million images in total. So once again, if you include all the versions of this idea that didn’t work, what you see here is just 1% of the work that was done.
And now we can not only fly through these photos, but also fly into photos. What a time to be alive! So what do you think? Does this get your mind going? What would you use this for? Let me know in the comments below.
This video has been supported by Weights & Biases. Being a machine learning researcher means doing tons of experiments and, of course, creating tons of data. But I am not looking for data, I am looking for insights. And Weights & Biases helps with exactly that. They have tools for experiment tracking, dataset and model versioning, and even hyperparameter optimization. No wonder this is the experiment tracking tool of choice of OpenAI, Toyota Research, Samsung, and many more prestigious labs.
Make sure to use the link wandb.me/paperintro, or just click the link in the video description, and try this 10-minute example of Weights & Biases today to experience the wonderful feeling of training a neural network and being in control of your experiments. After you try it, you won’t want to go back. Thanks for watching and for your generous support, and I’ll see you next time.
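
The third problem mentioned in the transcript (looking at fewer and fewer pixels as we fly closer) comes down to simple arithmetic. The sketch below is not from the video or the paper and does not implement Google's method; it only assumes a plain centered zoom, and the function names, image size, and zoom factor are hypothetical values chosen for illustration.

```python
def source_pixels_after_zoom(width, height, zoom):
    """Count the original pixels still in view after a centered zoom.

    Assumption: zooming by a factor `zoom` keeps a centered window of
    (width / zoom) x (height / zoom) original pixels, and that window
    must then be stretched to fill the full width x height frame.
    """
    if zoom < 1:
        raise ValueError("zoom must be >= 1 for a fly-in")
    return int(width / zoom) * int(height / zoom)


def fly_in_schedule(width, height, zoom_per_frame, num_frames):
    """Yield (frame index, zoom, original pixels per output pixel)."""
    total = width * height
    for i in range(num_frames):
        zoom = zoom_per_frame ** i
        visible = source_pixels_after_zoom(width, height, zoom)
        yield i, zoom, visible / total


if __name__ == "__main__":
    # Hypothetical example: a 512x512 photo, doubling the zoom every step.
    for frame, zoom, ratio in fly_in_schedule(512, 512, 2, 4):
        print(f"step {frame}: zoom {zoom}x, {ratio:.4f} original pixels per output pixel")
```

With these example numbers, each doubling of the zoom leaves only a quarter of the original pixels per output pixel, so the missing detail has to be synthesized, which is the job the transcript assigns to super-resolution.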
