Machine Learning Discussion with Daphne Koller and Yoshua Bengio


First of all, let me introduce Daphne Koller. It’s really a great honor for me to introduce her to you today. She is, as many of you know, a machine learning star, but she’s also an incredible achiever: recipient of numerous awards, named among the most important and influential people and game changers by the likes of Newsweek and Time magazine. As many of you also know, she co-founded Coursera, and she wrote the book on probabilistic graphical models. And finally, she wants to take advantage of machine learning to revolutionize healthcare. Let us welcome her.
So, Daphne, you have a very interesting and unusual trajectory in your career. Maybe you can start by telling us a little bit about it.
Okay. I started out like, I think, many of you in this audience, just loving machine learning, both in terms of the technical rigor that it allows and in terms of what it can accomplish. I was doing that for a long time and thought that that would be where I would continue my career. Over time, I became more and more interested, not just in the mental gymnastics of machine learning, but also in actually implementing change in the world using machine learning techniques. And so I became more applied over time. I started to work in computational biology and computational health long before it became popular, back in 1999, 2000. Initially it was because those were just cool data sets that had richer structure and more interesting problems than what we had to deal with back then, which was pre-MNIST. And then, over time, I just became excited about the opportunities that it opened up in terms of making an impact on the world and helping us understand fundamental problems in human health. At the same time, I also became, I think, somewhat less optimistic about the ability to effect meaningful change, in terms of reaching actual patients, from within academia.
I think it’s maybe a little bit different now than it was back then, although it’s still far from perfect. But getting companies out there to take ideas that are developed in academia and really see them through to the point that they actually make their way into a product is very difficult unless you transition people along with the ideas. You don’t just ship people a paper and expect that to work. And I saw that most clearly in the work that I did with my former PhD student Andy Beck, who then went on to become a professor at Harvard. We wrote what was, I think, arguably the first digital pathology paper. This was back in the days pre-deep learning. I know it’s hard for many of you to imagine that pre-deep learning existed. I mean, deep learning existed, but it wasn’t, you know, where it is today. And so we were using standard computer vision techniques to look at images of breast cancer pathology, microscopic images. And we were able to show, even with the relatively simplistic techniques that we were employing, that if you took a completely data-driven approach that didn’t have any preconceptions about what was important, and didn’t just look at the stuff that pathologists were looking at, two things came out of that. First of all, we were able to diagnose much better than your average pathologist, by a fairly significant margin. And second, the features that pathologists had been looking at were really not the right ones. They were looking at the features of the tumor cells, because naturally that made sense. That’s where the action was. It turns out that the environment that surrounded the tumor cells was actually even more indicative of what was going to happen to the patient than the tumor itself. Now, today I think a lot of people recognize that, hence all the conversation about what we now call the tumor microenvironment, but back then it was a fairly new thought.
So we said, wow, this is really cool, really important, and cancer patients are getting misdiagnosed every day. So let’s take those ideas and try to convince someone to bring that into clinical practice. And it just didn’t work. And then Andy went off and became a professor at Harvard and tried to do it there, and it didn’t work either.
So why was it so difficult?
It’s hard. There needs to be an internal champion within a company to take something like that into practice, and a lot of companies have siloed interests, and they have their own trajectory, and they’re not necessarily looking for new ideas from the outside. Andy, by the way, eventually gave up and ended up leaving Harvard and forming his own company to try to do that himself, because he realized that was really the only way to get it into practice. So, anyway, I’m going to pause, because I think we’re getting to some of the other topics later.
That’s fine. So, actually, related to what you’re talking about: on the one hand, you’re saying that it’s been difficult for machine learning researchers in academia to have an impact in healthcare. Can you tell us a little bit about the other side, like big pharma? What do you think about their chances of taking advantage of machine learning?
And you want me to do this on camera?
Yes. We need to name people.
We’re not going to name anybody. I think that there is a significant cultural barrier to success in general in the integration of machine learning and the life sciences, and it’s even more predominant in most big pharma companies. These are two communities with extremely different ways of thinking, very different vocabularies, and a huge barrier to entry on either side in terms of just the amount of knowledge that you need to acquire.
So even if you take two groups of people, biologists and machine learning people, and they’re both extremely eager to collaborate, and you put them in the room and tell them to work together, they might as well be talking [different languages] to each other. It’s just a very different mindset. So that’s already a problem, even if everyone is aligned on the goal of collaboration. On top of that, most pharma companies are not actually geared for that. There is the science silo of people sitting there coming up with scientific experiments based on whatever hypothesis they have. They generate a data set, rarely in consultation with anyone computational, then they throw it over the wall to the bioinformatics group and say, please analyze this for me and send me back the results. That’s not a culture that encourages the kind of experimental design that really drives forward innovative machine learning techniques. And I think that is a cultural and structural barrier that I’m not going to say is impossible to overcome, but it’s certainly going to be difficult.
So, speaking of this gap between biological research and development on the one hand and machine learning researchers on the other: let’s say I have some PhD students who are interested in applying machine learning to healthcare and biology. What would be your advice for them?
So first of all, I’m glad that there are such students, because we need more of them. The number of people who are bilingual in these two disciplines is incredibly small, and they are incredibly valuable. Because if you’re going to form an organization that actually does both, and applies machine learning to things other than, you know, advertising and e-commerce and image recognition, to something that I think has really fundamental implications for human health, you do need people who sit in the middle and speak both languages. So I would say to your students: really try to learn the vocabulary of the other side, try to take some classes, read some papers.
And what’s really important, as in any interdisciplinary area, is first of all to treat the other side with respect. They think differently from you. It doesn’t mean that they’re less intelligent. It just means they have a different set of techniques and a different set of ideas that they bring to the table. So be open-minded and listen to their perspective. Don’t be afraid to sound stupid by asking stupid questions, because you bring your own expertise to the table. So it’s okay for you to sound stupid when you don’t know the stuff that the other side knows. I’ve asked so many stupid questions of biologists, it’s ridiculous. I think it’s okay. But really treating both sides with respect is important. So if you come into the conversation with, we’re really smart machine learning people, we’re going to replace you, that’s a very bad starting point for the dialogue.
Yeah. So, related to my previous question: I guess you get those questions and I get those questions, and I’m curious to know how you answer those young people who are coming to us and asking, you know, how should I choose my career path? What research directions should I be looking at? What would you tell them?
Yeah. So I can tell you what has guided me for many, many years, which is: how much impact can I bring to the world? When I look back on my life, you know, whenever that is, I want to know that I’ve left the world a better place for my being there. And it’s different for every person. For me, it’s not just how much value I’ve brought, but even more so how much value I bring relative to what would happen if you were to take me out and put in someone else. So if I do something and it might be valuable, but there are 500 other people who could do that same thing as well or better, then maybe that’s not the ideal thing for me to be doing. What I want to do is things where I can bring the maximum value in a somewhat unique way.
And in some ways, that’s what keeps drawing me back to this intersection between biology, slash health, and machine learning, because, as we just discussed, the number of people who are genuinely bilingual in these disciplines is very, very small. And yet there is a crying need to bring machine learning and big data techniques to a field that really hasn’t benefited from them very much up until now. So I think that’s sort of a guiding principle. And I think the other part of that is: don’t be afraid to do something big. People often ask me, what do you regret about your career? And I think the one thing I maybe regret the most is not having tried to do something as big as Coursera earlier in my career. So just take on a really big, important challenge. Very few people regret having tried something and failed. A lot of people regret not having tried.
So tell us more about Coursera. Like, you know, what you’re most proud of, or the things that you wish you had done differently.
So Coursera kind of came on me unexpectedly. It was never part of the sort of charted career path that I had. I’d always had education as kind of a side interest of mine. You’re not supposed to really care about education as a professor, so I’d always done this kind of on the side. And then the work that I’d done on trying to make education at Stanford better using technology suddenly, together with work that had been done by colleagues like Andrew Ng and Sebastian Thrun and others, coalesced into this eruption of the MOOCs, the massive open online courses, back in the fall of 2011. And we saw that each of those courses that we had just launched as an experiment had 100,000 people in them or more. And it’s not just numbers. These were people from every country, every age group and every walk of life, people who would never otherwise have had access to a Stanford-quality education, using the opportunities that were available to them.
So I kind of said, okay, well, I have to see this through. I can’t just assume that someone will take this on and make this happen. Remember what I talked about in terms of the pathology work: the only way to do transfer is via warm bodies. I sacrificed what was then my research agenda, put it on hold for what was supposed to be a two-year leave of absence from Stanford, and, together with Andrew, founded Coursera. The two years came to an end, and I decided that I really needed to stay at Coursera longer to see it through, because it wasn’t quite ready for me to leave. So I ended up having to resign my Stanford position and stay at Coursera until the summer of 2016. So, what I’m the most proud of. I’m going to start with the other side, what I wish I’d done differently, and I’ll end on a high note. The stuff that I wish I’d done differently: I’d never had a startup. I’d never been in industry. I’d been an academic my entire life. I knew nothing. It’s almost pathetic today to think about all the mistakes that we made at the beginning of the company. And I wish I had recruited someone more senior who had been there, done that. Not necessarily more senior, but someone who had more experience with industry and startups and would have helped us avoid a lot of the mistakes that we made at the beginning. So if you’re doing a startup, hire people who complement you in skill set. The thing that I’m the most proud of is the lives that we’ve touched. Now, I know many of you here in the audience have been touched by Coursera and/or other online education opportunities, and maybe were drawn to machine learning by having had that. I’m proud of you, and proud of having enabled that. But I’m even more proud of the people who are not in this room and will never be in this room. Like this woman in Bangladesh who ran away with a friend because her friend was about to be sold into indentured servitude, which happens a lot in Bangladesh.
And they started a bakery, but the bakery wasn’t making it, because neither of them knew how to run a business. And she found out through Coursera how to run a business. She took classes from Michigan and Wharton and others and learned how to make her bakery a success, to the point that it is now employing ten people, women, most of whom would have been sold into indentured servitude. Every week at Coursera, at all hands, we read a learner story. A lot of those stories were like this one from Bangladesh, or a man from Nigeria, or a disabled boy from the United States who would never have had an opportunity to have a traditional educational experience. We transformed lives. And that is the thing that I’m the proudest of.
Speaking of helping women, how has it been for you as a woman researcher, and also in industry, given the male world in which we live?
We’ve been hearing a lot about this recently, and it’s really amazing, both in a positive and a sad way, to see this coming out. Personally, I have to say I’ve been one of the lucky ones. I haven’t experienced some of the more egregious of the behaviors that we’re hearing about right now. Maybe because when I started out in machine learning it was a much smaller community and there wasn’t quite as much of the behavior that we’re hearing about, I don’t know. But at least I can say for myself, I’ve been fortunate. However, there are a lot of other aspects that we don’t talk about very much yet, because sexual harassment and abuse are so much more important to take care of first. A lot of those other things I’ve experienced all the time. Those little subtle insults and derogatory comments, and being relegated to second-class citizen, happen to women all the time, at all levels.
So I’ll give one example. I could stand here for this entire talk and give nothing but examples of this, but I’ll give one of my favorites. When I was at Coursera, I was introduced to the CEO of a Fortune 500 company, a very big one, and I was clearly introduced as Daphne Koller, co-founder, CEO of Coursera. And I replied saying, it’s very nice to meet you, I look forward to our meeting, and my assistant James will be in touch to coordinate a meeting. And the response back from his assistant was: dear Daphne, can you confirm James’s availability for this and that date. It’s not funny, but it’s true, and it happened all the time. It happened to me at Stanford with male colleagues: I would be going to a meeting with a male colleague, and I would be asked to confirm scheduling. We would constantly have emails addressed to myself and a male colleague as dear Professor So-and-so and Daphne. And it’s like, I don’t mind being called Daphne, but the contrast was jarring. And then the other aspect, of course, is when you’re sitting in a room full of people, all of whom besides you are male, and you say something and no one pays attention, and then three minutes later a male colleague says the exact same thing, and it’s like, John, what a great idea. And what do you say? I said that three minutes ago? It sounds terrible, right? It sounds like you’re constantly trying to get credit. Or do you let it go? That’s the other option, and that’s not great either. There’s just no good answer in these situations. And it happens. For those of you in the audience who are women, how many of you has that happened to?
Not many women, unfortunately. Speaking of which, there are not enough women here.
Yeah.
Do you have any suggestion for how we can move things in a better direction? And, I don’t know, maybe reflecting also on what you’ve seen at Stanford or in other places that might have worked.
Let me say two things that could help. One is, when you see some of the behaviors that I talked about, and I’m not just talking about the super egregious ones like the sexual harassment and the groping, but even those subtle things, speak up. It’s really hard for the woman to speak up, but it’s a lot easier for you. So say something. Say, you know what, Jane said that three minutes ago, and I think it’s a really terrific idea. It’s a lot easier for you to do that. The second thing is, I think partly as a community we’ve moved into an area that is much more about, and I’m hoping not to insult anyone, narrow papers that have a lot of really cool mental gymnastics in them. The cool new algorithm, the cool new technique. Those are important, but sometimes there are other papers that don’t have quite as much cool mental gymnastics but solve a really important problem, maybe using simple ideas, ideas that someone else has already come up with, and you just use them in a novel and interesting way, but you solve a problem that is societally important. Women tend to get drawn, I’m overgeneralizing, but I saw that at Stanford, to problems that are societally meaningful, and sometimes those problems don’t call for the fanciest solution, and trying to force a fancy solution on them isn’t the right approach. Demeaning or devaluing those papers because they don’t have that, I think, doesn’t benefit us as a community, and it also tends to turn away people who care about solving real problems rather than just coming up with the coolest thing, and women are maybe a larger fraction of those people than in other parts of the space. And so I think, as a community, if we were more open-minded about some of these papers that just really solve important problems, maybe that would be more welcoming to a broader subset of people.
So let me return a little bit to the technical side. What do you consider to be exciting directions of research for machine learning people such as we have here? Maybe ones that are overlooked, or that you would find exciting.
So I work in the area of biology and health, and I think that area has many amazing opportunities, but it’s characterized by a set of problems that are, I think, in many ways less of a focus for a lot of the work that’s being done here. The data sets there, even the larger ones, are not large by machine learning standards. Really large ones have a few hundred thousand samples. A typical one might have ten thousand. TCGA, for instance, which is one of the larger cancer data sets out there, has data from ten thousand individuals, with multiple data modalities and multiple measurements for each cancer. If you think about the kind of challenges that that data set poses, they’re very different from when you’re dealing with, you know, a bunch of images. First of all, there is the heterogeneity of the samples: each of those was collected at a different hospital, often with a different assay. That’s an issue. It’s not very large, so you need to figure out how to employ techniques such as multi-task learning or transfer learning or zero-shot learning, which people are starting to play around with a little bit but which are not as commonplace as more standard ones. There’s the opportunity to bring in, and I know that’s considered suspect these days, prior knowledge into the models to some extent, because you have to compensate for the lack of hundreds of thousands of samples with something else that gives you power. So these are directions that I think are really important to think about. Artifacts and batch effects: we’ve heard about that in the context of some of the fairness work that’s been done here, but it also comes up in these other applications, with small data sets and so on.
Those, I think, are important directions. And so maybe, as a meta-answer to your question, Yoshua: if we all stopped looking at the exact same data sets that everyone else has looked at, a whole new set of challenges would immediately emerge and jump out at you as things that we should be doing.
So tell me about your views on the particular machine learning problem of interpretability.
I think interpretability is a nuanced question. Instead of jumping to everything needs to be interpretable or nothing needs to be interpretable, it’s very much a question you answer, you know, data set by data set and problem by problem. So if you’re looking to get absolutely the maximal performance on some image recognition task, it probably matters less whether your model is interpretable. But if your goal is to work with scientists, for instance, and really help do basic scientific discovery, the predictions are in many cases a byproduct of understanding what it is that the model is telling you about the underlying biology. And then there are a lot of things in between, where you might care about this but only to a certain extent. So, again, you don’t want to just jump to the conclusion that everything in science requires the model to be interpretable. For instance, if you’re trying to just make a really good prediction about whether a molecule is going to bind to another molecule, it might be interesting to know why, but fundamentally you just care about making good predictions. So you really want to think about your problem and to what extent interpretability is called for, and then design your model accordingly, as opposed to trying to overgeneralize one way or the other.
So, I did deep learning, one of the first, before deep learning was deep learning. And so, with deep learning becoming more and more important, what do you think about the prospects for previous approaches that have been popular in machine learning? What do you see for the future?
I think it’s really important to avoid a mindset of, I have an amazing hammer, so everything must be a nail. I think deep learning has demonstrated value way beyond what any of us, certainly myself, would have anticipated, and it’s a very powerful tool for solving a certain set of problems. I don’t think, sorry, it’s the solution for every problem. And again, I think a thoughtful approach is: what am I trying to solve, and to what extent is deep learning the right solution, to what extent is a PGM the right solution? Maybe there’s an interesting hybrid that makes sense. I think, and I mentioned this earlier, that because of the success of deep learning in cases where there are large data sets available, the pendulum has swung too far, in my opinion, towards models that are largely knowledge-free. And maybe there’s a class of problems where models that have both data and knowledge in them are a good solution. So again, thinking outside the box, outside the current box, and looking at new data sets that might call for a different set of solutions than the ones that are currently the standard, I think, is something that would benefit the community.
So there’s one area which sits somewhere near the intersection of graphical models and also interpretability, and also where deep learning, I think, can play an important role, and that’s causality. Do you have something to say about this?
Causality is actually really important. It’s certainly important when you think about biology and the problems that it tackles, in that most of those problems are interventional. If I give this drug to a patient, will they become better? That is a causal question, intrinsically. That question is very different from the question: this drug was given to this set of people, did they become better?
That is really not a causal question, because there are so many confounders in it: the doctor who decided to prescribe this medication, the extent to which the patient stuck to the regimen, compliance. There are just so many artifacts there. The population of people who got access to doctors who prescribed the more efficacious medication, maybe they’re just the ones who are better off to begin with, because they’re richer and more privileged. There are just so many confounders that you’ll never be able to correct for using a purely observational analysis. So I think that’s an area, and maybe this is another important answer to your previous question, that the community hasn’t devoted enough energy to.
Thank you. I would like now to open the floor for questions. There’ll be the usual people with microphones. Yes, here.
Thank you. If you could, I’ll use the term, wave a magic wand to get a biomedical institution to meet the machine learning community at least in the middle, what suggestions would you have?
I think that’s a terrific question. I’m actually glad you asked that, because one of the things I didn’t get to talk about with Yoshua is what I’m doing next. I think one of the things that has really hampered progress is that scientists design data sets and experiments to their purpose, and those are often very good purposes, but that doesn’t necessarily serve the needs of building really good machine learning models that might come at the problem in a completely different way. So I think what is really valuable is to take those two groups, put them in the room and say: what are the really important problems that you wish you had a magic wand for and would like to solve? Not what is the problem that you’re solving today, but what are the really big questions that you’d like to answer?
And then, if we can come up with one, if there is confidence or some reason to believe that machine learning might be the right tool for that type of question, which it may or may not be, you know, machine learning too is not the magic bullet for everything, but if it is, let’s think about how to design data sets that will allow the machine learning approaches to be trained effectively and applied effectively. This is very different from, I’m going to produce data, you please analyze it for me. And so I think getting those two groups to work together as equal partners is an absolutely critical step towards using machine learning effectively in the biology and health fields.
Can you tell us a bit more about your new venture?
Thank you. So yesterday, as it happens, and it actually was coincidental, I launched the company that I’m very proud of, because it’s trying to do exactly that. It tries, you know, in the specific area of drug discovery and development, to take life scientists and machine learning people, put them together in the room and say: what are the really big questions in drug discovery that, if we had a magic wand, we wish someone could solve for us efficiently, and is machine learning the right solution for that? And if so, one of the things that biologists have done amazingly in the last five years is design tools that allow us to produce data at scale. That was not the case five or seven years ago. When I went to Coursera, a large data set at that point was a couple hundred samples. Now biologists, by virtue of miniaturization and microfluidics and robotics and automation and better cellular models and things like CRISPR and various measurement assays, are able to create really, really large data sets at scale. But no one’s using that for creating data for machine learning. But you could.
So how does that compare with the traditional way of drug development?
The traditional way of drug development is by and large very hypothesis-driven. Again, I’m overgeneralizing, and I know there will be someone who says, but here we do it differently, and that’s starting to happen more and more. But the way it typically happens is: here’s a disease that I think has a large market size and a large unmet need. Here is the pathway that I’ve read in papers seems to be implicated in that disease, and within that, here is the target that I think is most plausible or most druggable or whatever. So now I’m going to devise a small molecule or a large molecule that binds to that target. It’s very hypothesis-driven, very intuitive, very much based on biological knowledge, and not very much based on unbiased, data-driven exploration of a much broader set of hypotheses. That creates this weird inverted funnel: in this early exploratory phase, a relatively small portion of the funding is devoted to the place where you have fairly large leverage over what you pursue, and by far the bulk of the money that’s spent on drug development is spent on phase three clinical trials, by which point you’re pretty much just praying that whatever it is you devised actually works. What if you inverted things a little bit and put more money into the upfront exploration phase, so that you might increase the success rate of that phase three clinical trial?
Do you think we could reduce the time to having drugs available to people?
That’s the hope. That’s the hope.
But I mean, how does that impact the, like, I’ve heard numbers like 10, 15 years.
So some things can’t be hurried. Unfortunately, human biology is what it is, and it doesn’t really much matter how much you want it to go faster. If you’re looking to see whether a drug stops or reduces the extent to which you get heart attacks, or whether it slows cognitive decline and dementia, it takes five years for that to manifest, and you can’t speed that along.
But there are certain things that you can speed up. A lot of the earlier preclinical discovery still takes seven or eight years, and I think those are places where you can have savings. I think even the clinical trials can potentially be sped up in a variety of ways. For instance, one of the reasons clinical trials take so long is that it takes a long time to recruit a large population, which you need in order to get convincing statistical evidence that something is efficacious. But what happens if we knew up front who the right people are for that particular clinical trial, and you were able to target and recruit a much smaller population and have a much larger effect size manifest, hopefully over a shorter time frame? So again, I don’t think you can completely shorten that 15 years to, we’re going to get it done in two years. That’s just, unfortunately, a ridiculous hope. But there are places where you can shave off a year here, a couple of years there, and maybe reduce it down from 15 to five or seven. That would already be amazing.
All right. On this, let us thank Daphne Koller. Really, I’m super happy to have you here today. Well, let’s say thank you.
Thank you.
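As a rough illustration of the point about recruiting a smaller, better-targeted trial population (this is not something shown in the conversation), the sketch below uses the textbook normal-approximation formula for a two-arm trial comparing means, n per arm ≈ 2 · ((z₁₋α/₂ + z_power) / d)², where d is the standardized effect size. The function name `patients_per_arm`, the 5% two-sided significance level, the 80% power and the example effect sizes are all assumptions chosen for illustration, not figures from the discussion.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(effect_size, alpha=0.05, power=0.8):
    """Approximate patients needed in each arm of a two-arm trial.

    effect_size is the standardized mean difference between arms
    (difference in means divided by the common standard deviation).
    Uses the normal approximation, so it is only a rough guide.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = std_normal.inv_cdf(power)          # desired statistical power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes: a broad population versus a targeted one.
    for d in (0.1, 0.2, 0.5):
        print(f"effect size {d}: about {patients_per_arm(d)} patients per arm")
```

Because the required sample size scales with 1/d², doubling the effect size by enrolling the patients most likely to respond cuts the required enrollment to roughly a quarter under these assumptions, which is the mechanism behind the hoped-for shorter recruitment time.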
