#045 Microsoft's Platform for Reinforcement Learning (Bonsai)

Microsoft has an interesting strategy with its new “autonomous systems” technology, also known as Project Bonsai. They want to create an interface that abstracts away the complexity and esoterica of deep reinforcement learning. They want to fuse expert knowledge and artificial intelligence on one platform, so that complex problems can be decomposed into simpler ones. They want to take machine learning PhDs out of the equation and make autonomous systems engineering look more like a traditional software engineering process. It is an ambitious but interesting undertaking. Reinforcement learning is extremely difficult (as I cover in the video), and if you don’t have a team of RL PhDs with tech industry experience, you shouldn’t even consider doing it yourself. This is our take on it!

There are 3 chapters in this video:

Chapter 1: Tim’s intro and take on RL being hard, intro to Bonsai and machine teaching
Chapter 2: Interview with Scott Stanfield [recorded Jan 2020]
Chapter 3: Traditional street talk episode [recorded Dec 2020]

This is *not* an official communication from Microsoft; all opinions are personal. There is no MS-confidential information in this video.

Dr. Keith Duggar
Dr. Tim Scarfe
Yannic Kilcher

Scott Stanfield

Megan Bloemsma

Gurdeep Pall (he has not validated anything we have said in this video or been involved in the creation of it)

Project Bonsai:

Machine Teaching – A New Paradigm for Building Machine Learning Systems
Patrice Y. Simard et al.

00:00:00 CHAPTER 1: Show kick off
00:03:30 How Bonsai came to be
00:04:27 Teaser
00:06:05 Deep RL doesn't work yet (Alex Irpan article)
00:23:33 Is Deep RL “intelligent”? What would Francois say?
00:29:54 Machine Teaching
00:39:10 Interface piece
00:42:05 MOAB demonstration
00:45:23 Bonsai – how it works
00:49:35 Demo
00:52:42 Don't need data scientists anymore
00:56:41 CHAPTER 2: Scott Interview
00:57:30 Machine Teaching
01:02:34 Project MOAB
01:05:04 How does Bonsai work behind the scenes
01:08:08 Does it work with non-regular processes?
01:11:06 The world of simulators
01:12:52 Training process / machine teaching
01:18:40 Relate it to normal RL?
01:21:42 Value prop of Bonsai?
01:25:00 Machine teaching again
01:30:48 Some examples of positive reinforcement
01:35:43 Tim take on bonsai
01:38:13 CHAPTER 3: Street Talk Team check out Project MOAB
01:40:32 Model free RL
01:43:07 How did we train this on Bonsai?
01:44:19 Who is the target customer for Bonsai?
01:46:44 Why not just use normal RL?
01:49:02 Machine Teaching
02:13:41 The Man-Machine Mind Meld
02:19:38 Bonsai is the Databricks of RL
02:21:36 Yannic leaves and 2 balls
02:22:17 Consulting arrangement
02:22:51 Generalizing to two balls, bias-variance trade-off
02:27:43 Outro, MOAB looks cool!

Deep Reinforcement Learning Doesn’t Work Yet (Alex Irpan)

Pod version: https://anchor.fm/machinelearningstreettalk/episodes/045-Microsofts-Platform-for-Reinforcement-Learning-Bonsai-er84to
