Two Minute Papers: Artistic Style Transfer For Videos #68
Artificial neural networks were inspired by the human brain and simulate how neurons behave when they are shown a sensory input (e.g., images or sounds). They are known to be excellent tools for image recognition and many other problems beyond that – they also excel at weather prediction, breast cancer cell mitosis detection, brain image segmentation and toxicity prediction, among many others. Deep learning means that we use an artificial neural network with multiple layers, making it even more powerful for more difficult tasks.
This time they have been shown to be apt at reproducing the artistic style of many famous painters, such as Vincent Van Gogh and Pablo Picasso among many others. All the user needs to do is provide an input photograph and a target image from which the artistic style will be learned.
And now, onto the next frontier: transferring artistic style to videos!
Recommended for you:
Deep Neural Network Learns Van Gogh’s Art
Deep Learning Program Learns to Paint: https://www.youtube.com/watch?v=UGAzi1QBVEg
From Doodles To Paintings With Deep Learning: https://www.youtube.com/watch?v=jMZqxfTls-0
Sintel Movie copyright: Blender Foundation https://durian.blender.org/sharing/
The thumbnail background image was taken from the corresponding paper.
Splash screen/thumbnail design: Felícia Fehér – http://felicia.hu
This transcript was generated by an AI at Otter.ai
Dear Fellow Scholars,
this is Two Minute Papers with Károly Zsolnai-Fehér. Here we have previously talked about a technique that used a deep neural network to transfer the artistic style of a painting to any arbitrary image, for instance to a photograph. As always, if you’re not familiar with some of these terms, we have discussed them in previous episodes and links are available in the YouTube description, make sure to check them out.
Style transfer is possible on still images, but since there was previously no technique to apply it to videos, it is hopefully abundantly clear that a lot of potential still lies dormant here. But can we apply this artistic style transfer to videos? Would it work if we simply tried? For an experienced researcher, it is blatantly obvious that saying it wouldn't work is an understatement. It would fail in a spectacular manner, as you can see here – but with this new technique, it apparently works quite well.
To be frank, the results look gorgeous. So how does it work? Now, don't be afraid – you'll be presented with a concise but, at first, somewhat obscure statement: this technique preserves temporal coherence when applying the artistic style by incorporating the optical flow of the input video. Now the only question is what temporal coherence and optical flow mean. Temporal coherence is a term used by physicists to describe, for instance, how the behavior of a wave of light changes or stays the same if we observe it at different times. In computer graphics, it is also an important term, because oftentimes we have techniques that we can apply to one image, but not necessarily to a video, because the behavior of the technique changes drastically from frame to frame, introducing a disturbing flickering effect that you can see in this video.
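To make the flickering concrete, here is a minimal, hypothetical numpy sketch (not the paper's method – `toy_stylize` is a stand-in of my own): the same static shot is "stylized" frame by frame, and a fresh random seed per frame mimics the style transfer landing on a different result each time, producing exactly the frame-to-frame fluctuation we call a lack of temporal coherence.

```python
import numpy as np

def toy_stylize(frame, seed):
    # Stand-in for a neural style transfer pass: the random seed mimics
    # the optimization landing in a different local optimum on each frame.
    r = np.random.default_rng(seed)
    return frame + 0.3 * r.standard_normal(frame.shape)

# Ten identical frames: a static shot, so a coherent result should not change.
frames = [np.full((4, 4), 0.5) for _ in range(10)]

# Independent stylization: a new seed per frame -> flicker.
independent = [toy_stylize(f, seed=i) for i, f in enumerate(frames)]
flicker = np.mean([np.abs(a - b).mean()
                   for a, b in zip(independent, independent[1:])])

# Shared information across frames (here: the same seed) -> stable result.
coherent = [toy_stylize(f, seed=0) for f in frames]
stable = np.mean([np.abs(a - b).mean()
                  for a, b in zip(coherent, coherent[1:])])

print(flicker > stable)  # -> True: the independent version fluctuates
```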
Here, we have the same problem if we do the artistic style transfer. Because there is no communication between the individual images of the video, the technique has no idea that most of the time we're looking at the same things, and that, if so, the artistic style would have to be applied the same way over and over to these regions – we are clearly lacking temporal coherence. Now, onto optical flows. Imagine a flying drone that takes a series of photographs while hovering and looking around. To write sophisticated navigation algorithms, the drone would have to know which object is which across many of these photographs. If it has turned only slightly, most of what we see is the same, and only a small part of the new image is new information. But the computer doesn't know that, as all it sees is a bunch of pixels. Optical flow algorithms help us achieve this by describing the possible motions that give us photograph B from photograph A in this application.
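As an illustration only – real optical flow is estimated per pixel and is far more sophisticated – a brute-force, whole-image version of the idea fits in a few lines of numpy: search over small translations for the one that best turns photograph A into photograph B.

```python
import numpy as np

def estimate_shift(a, b, max_shift=3):
    """Brute-force global 'optical flow': find the (dy, dx) translation
    that best explains how photograph `b` arises from photograph `a`."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(a, dy, axis=0), dx, axis=1)
            err = np.abs(shifted - b).mean()
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

rng = np.random.default_rng(1)
a = rng.random((32, 32))               # photograph A
b = np.roll(a, (2, -1), (0, 1))        # photograph B: the camera turned slightly
print(estimate_shift(a, b))            # -> (2, -1)
```

Libraries such as OpenCV provide proper per-pixel flow estimators (e.g. `cv2.calcOpticalFlowFarneback`); the toy search above only conveys the matching principle.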
What this means is that there is some inter-frame communication: the algorithm will know that if it colored this person a certain way a moment ago, it cannot drastically change the style of that region on a whim. It is now easy to see why naively applying such a technique to the individual frames would be a flippant attempt at creating beautiful, smooth-looking videos. So now it hopefully makes a bit more sense: this technique preserves temporal coherence when applying the artistic style by incorporating the optical flow of the input video.
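A toy numpy sketch of that inter-frame constraint (the function names are my own, and the real method additionally masks occluded regions): warp the previous stylized frame along the optical flow, and penalize the current stylized frame wherever it disagrees with that motion-compensated prediction.

```python
import numpy as np

def warp(img, flow):
    """Warp a frame along a constant (dy, dx) flow – a stand-in for the
    per-pixel optical-flow warp used by the real method."""
    dy, dx = flow
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def temporal_loss(stylized_t, stylized_prev, flow):
    # Penalize the current stylized frame wherever it deviates from the
    # motion-compensated previous one (occlusion masking omitted here).
    return np.mean((stylized_t - warp(stylized_prev, flow)) ** 2)

rng = np.random.default_rng(2)
prev = rng.random((16, 16))
flow = (1, 0)                        # the scene moved one pixel down
consistent = warp(prev, flow)        # current frame follows the motion
flickering = rng.random((16, 16))    # current frame restyled from scratch

print(temporal_loss(consistent, prev, flow))       # -> 0.0
print(temporal_loss(flickering, prev, flow) > 0.1) # -> True: flicker is punished
```

Minimizing a term like this alongside the usual style and content losses is what keeps the stylization from changing on a whim between frames.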
Such great progress in so little time, loving it. Thanks for watching and for your generous support and I’ll see you next time.