Two Minute Papers: Stable Diffusion Is Getting Outrageously Good!
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, you will see the power of human ingenuity supercharged by an AI. As we are living the advent of AI-based image generation, we now have several tools that are so easy to use, we just enter a piece of text and out comes a beautiful image. Now, you’re asking, okay Károly, they are easy to use, but for whom? Well, good news. We now have a new solution called Stable Diffusion, where the model weights and the full source code are available for free for everyone. Finally! We talked a bit about this before, and once again, I cannot overstate how amazing this is. I am completely spellbound by how the community has worked together to bring this project to life, and you Fellow Scholars just kept on improving it. It harnesses the greatest asset of humanity, and that is the power of the community working together. I cannot believe how much Stable Diffusion has improved in just the last few weeks. Don’t believe it? Well, let’s have a look together at 10 amazing examples of how the community is already using it.
One, today, image generation works so inexpensively that we don’t even necessarily need to generate our own images. We can even look at this amazing repository where we enter a prompt and can find thousands and thousands of generated images for that concept. Yes, even for Napoleon cats, we have thousands of hits. So good. Now, additionally, we can also add a twist to it by photographing something in real life, obtaining a text prompt for it, and bam! It finds similar images that were synthesized by Stable Diffusion. This is variant generation of sorts, but piggybacking on images that have been synthesized already, therefore we can choose from a large gallery of these works.
Two, by using a little trickery and the image inpainting feature, we can now create these amazing infinite zoom images. So good!
Three, whenever we build something really cool with Legos, we can now ask Stable Diffusion to reimagine what it would look like if it were a real object. The results are by no means perfect, but based on what it comes up with, it really seems to understand what is being built here and what its real counterpart would look like. I love it!
Four, after generating a flat, 2D image with Stable Diffusion, with other techniques, we can obtain a depth map which describes how far different objects are from the camera. Now that is something that we’ve seen before. However, now in Adobe’s After Effects, look, we can create this little video with a parallax effect. Absolutely incredible!
Five, have a look at this cat knight. I love the eyes and all of these gorgeous details on the armor. This image really tells a story, but what is even better is that not only is the prompt available, but Stable Diffusion is also a free and open source model, so we can pop the hood, reuse the same parameters as the author, and get a reproduction of the very same image. And it is also much easier to edit it this way if we wish to see anything changed.
Six, if we are not the most skilled artist, we can draw a really rudimentary owl, hand it to the AI, and it will draw the rest of this fine owl.
Seven, and if you think the drawing to image example was amazing, now hold on to your papers for this one. This Fellow Scholar had a crazy idea. Look, these screenshots of old Sierra video games were given to the algorithm, and there is no way, right? Well, let’s see. Oh wow! Look at that! The results are absolutely incredible. I love how closely it follows the framing and the mood of the original photos. I have to be honest, some of these feel good to go as they are. What a time to be alive!
Eight, with these new web apps, variant generation is now way easier and faster than before. It is now as simple as dropping in an image.
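As a rough illustration of the depth-map idea from example four, here is a minimal sketch of how a parallax effect can be derived from per-pixel depth. It is not taken from the video and is not how After Effects does it; the function name `parallax_row`, its parameters, and the simple hole-filling rule are all assumptions made for illustration. The only idea it relies on is that objects closer to the camera appear to move more than distant ones when the camera shifts sideways.

```python
def parallax_row(pixels, depths, camera_shift):
    """Shift one row of pixels sideways; nearer pixels move further.

    pixels       -- list of pixel values for a single image row
    depths       -- list of positive distances from the camera, one per pixel
    camera_shift -- hypothetical sideways camera offset (in pixel units at depth 1)
    """
    width = len(pixels)
    shifted = [None] * width

    # Draw far pixels first so that nearer ones overwrite them.
    for x in sorted(range(width), key=lambda i: depths[i], reverse=True):
        # Apparent motion is inversely proportional to distance from the camera.
        target = x + int(camera_shift / depths[x])
        if 0 <= target < width:
            shifted[target] = pixels[x]

    # Fill holes uncovered by the moving foreground with the neighbor on the left
    # (a crude stand-in for proper inpainting).
    for x in range(width):
        if shifted[x] is None:
            shifted[x] = shifted[x - 1] if x > 0 else pixels[x]
    return shifted


# A tiny one-row "image": a nearby cat in front of a distant sky.
row = ["sky", "sky", "cat", "sky", "sky"]
depth = [4.0, 4.0, 1.0, 4.0, 4.0]
print(parallax_row(row, depth, camera_shift=2))
```

Calling this for every row of an image, with a slowly changing `camera_shift` per frame, would give a crude version of the effect: the foreground slides across a nearly static background.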
By the way, a link to each of these is available in the video description, and their source code is available as well.
Nine, in an earlier episode, we had a look at how artists are already using DALL-E 2 in the industry to take a photo of something and, miraculously, extend it almost infinitely. This is called texture synthesis, and there are no seams anywhere to be seen. And now, dear Fellow Scholars, seamless texture generation is possible in Stable Diffusion too. Not too many years ago, we needed not only a proper handcrafted computer graphics algorithm to even have a fighting chance to create something like this, but implementing a bunch of these techniques was also required because different algorithms did well on different examples. And now, just one tool can do it all. How cool is that?
And 10, Stable Diffusion itself is also being improved. Oh yes, this new version adds super-resolution to the mix, which enables us to synthesize even more details and even higher resolution images with it. This thing is improving so quickly, we can barely keep up with it.
So, which one was your favorite? Let me know in the comments below. And once again, this is my favorite type of work, which is free and open for everyone. So, I would like you Fellow Scholars to also take out your digital wrenches and create something new and amazing. Let the experiments begin.
This episode is brought to you by Anyscale, the company behind Ray, the fastest growing open source framework for scalable AI and scalable Python. Thousands of organizations use Ray, including OpenAI, Uber, Amazon, Spotify, Netflix, and more. Ray lets developers iterate faster by providing common infrastructure for scaling data ingest and pre-processing, machine learning training, deep learning, hyperparameter tuning, model serving, and more, all while integrating seamlessly with the rest of the machine learning ecosystem.
Anyscale is a fully managed Ray platform that allows teams to bring products to market faster by eliminating the need to manage infrastructure and by enabling new AI capabilities. Ray and Anyscale can do recommendation systems, time series forecasting, document understanding, image processing, industrial automation, and more. Go to anyscale.com/papers and try it out today. Our thanks to Anyscale for helping us make better videos for you. Thanks for watching and for your generous support, and I’ll see you next time.