Two Minute Papers: Recurrent Neural Network Writes Sentences About Images | Two Minute Papers #23
This technique is a combination of two powerful machine learning algorithms:
– convolutional neural networks, which are excellent at image classification, i.e., recognizing what is seen in an input image,
– recurrent neural networks, which can process sequences of inputs and outputs and can therefore generate a sentence describing what is seen in the image.
Combining these two techniques makes it possible for a computer to describe in a sentence what is seen in an input image.
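As a rough illustration of the encoder–decoder idea, here is a tiny sketch in NumPy: a feature vector stands in for the convolutional network's output, and a simple recurrent network conditioned on it greedily emits words until an end token. The vocabulary, layer sizes, and weights below are all made up (random and untrained); the real method learns them from large sets of image–caption pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and sizes -- not from the paper.
vocab = ["<start>", "<end>", "a", "dog", "on", "grass"]
V, H, F = len(vocab), 8, 10  # vocab size, RNN hidden size, image feature size

# Stand-in for the CNN: in the real system this vector comes from the
# top layer of a convolutional network applied to the input image.
image_features = rng.normal(size=F)

# RNN parameters (random here; learned from data in practice).
W_init = rng.normal(size=(H, F)) * 0.1  # image features -> initial hidden state
W_emb  = rng.normal(size=(H, V)) * 0.1  # word embeddings
W_hh   = rng.normal(size=(H, H)) * 0.1  # recurrent (hidden-to-hidden) weights
W_out  = rng.normal(size=(V, H)) * 0.1  # hidden state -> word scores

def generate_caption(features, max_len=10):
    """Greedy decoding: condition the RNN on the image, then emit words one by one."""
    h = np.tanh(W_init @ features)        # image conditions the initial state
    word = vocab.index("<start>")
    caption = []
    for _ in range(max_len):
        h = np.tanh(W_hh @ h + W_emb[:, word])  # feed the previous word back in
        word = int(np.argmax(W_out @ h))        # pick the highest-scoring word
        if vocab[word] == "<end>":
            break
        caption.append(vocab[word])
    return caption

print(generate_caption(image_features))
```

With trained weights, the same loop produces captions like the ones in the gallery linked below; with random weights it just emits arbitrary words from the toy vocabulary.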
_____________________
The paper “Deep Visual-Semantic Alignments for Generating Image Descriptions” is available here:
http://cs.stanford.edu/people/karpathy/deepimagesent/
A gallery with more results with the same algorithm:
http://cs.stanford.edu/people/karpathy/deepimagesent/generationdemo/
You can train your own convolutional neural network here:
http://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
The source code for the project is now available here:
https://github.com/karpathy/neuraltalk2
Subscribe if you would like to see more of these! – http://www.youtube.com/subscription_center?add_user=keeroyz
The thumbnail image background was made by Georgie Pauwels (CC BY 2.0) – https://flic.kr/p/qrRciQ
Splash screen/thumbnail design: Felícia Fehér – http://felicia.hu
Károly Zsolnai-Fehér’s links:
Patreon → https://www.patreon.com/TwoMinutePapers
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/