# #23 Machine Learning Specialization [Course 1, Week 2, Lesson 1]

I remember when I first learned about vectorization. I spent many hours on my computer taking an unvectorized version of an algorithm, running it, seeing how long it ran, and then running a vectorized version of the code and seeing how much faster that ran. I just spent hours playing with that, and it frankly blew my mind that the same algorithm, vectorized, would run so much faster. It felt almost like a magic trick to me. In this video, let’s figure out how this magic trick really works. Let’s take a deeper look at how a vectorized implementation may work on your computer behind the scenes.

Let’s look at this for loop. A for loop like this runs without vectorization, so if j ranges from 0 to, say, 15, this piece of code performs operations one after another. On the first time step, which I’m going to write as time 0 or t0, it first operates on the values at index 0. At the next time step, it calculates the values corresponding to index 1, and so on until the 16th time step, where it computes the values at index 15. In other words, it carries out these computations one step at a time, one step after another.

In contrast, this function in NumPy is implemented in the computer hardware with vectorization. So the computer can get all the values of the vectors w and x, and in a single step, it multiplies each pair of w and x with each other, all at the same time, in parallel. Then, after that, the computer takes these 16 numbers and uses specialized hardware to add them all together very efficiently, rather than needing to carry out distinct additions one after another to add up these 16 numbers. This means that code with vectorization can perform calculations in much less time than code without vectorization. And this matters more when you’re running learning algorithms on large datasets or trying to train large models, which is often the case with machine learning.
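
The contrast between the for loop and the NumPy function can be sketched roughly as follows. This is a minimal illustration, not the exact code shown on the slide: the values in `w` and `x` and the names `f_loop` and `f_vec` are assumptions made up for the example, and the NumPy function is assumed to be `np.dot`.

```python
import numpy as np

# Hypothetical example values; the video does not give specific numbers.
n = 16
w = np.arange(1.0, n + 1.0)  # 1.0, 2.0, ..., 16.0
x = np.ones(n)               # sixteen ones

# Without vectorization: one multiply-and-add per iteration, j = 0 .. 15.
f_loop = 0.0
for j in range(n):
    f_loop = f_loop + w[j] * x[j]

# With vectorization: a single call that multiplies all pairs and sums them.
f_vec = np.dot(w, x)

print(f_loop, f_vec)
```

Both versions compute the same number; the difference is only in how the work is carried out.
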
So that’s why being able to write vectorized implementations of learning algorithms has been a key step to getting learning algorithms to run efficiently, and therefore to scale well to the large datasets that many modern machine learning algorithms now have to operate on.

Now let’s take a look at a concrete example of how this helps with implementing multiple linear regression, that is, linear regression with multiple input features. Say you have a problem with 16 features and 16 parameters, w1 through w16, in addition to the parameter b. You calculated 16 derivative terms for these 16 weights, and in code maybe you stored the values of w and d in two NumPy arrays, with d storing the values of the derivatives. For this example, I’m just going to ignore the parameter b. Now you want to compute an update for each of these 16 parameters. So wj is updated to wj minus the learning rate, say 0.1, times dj, for j from 1 through 16.

In code without vectorization, you would be doing something like this: update w1 to be w1 minus the learning rate 0.1 times d1, next update w2 similarly, and so on through w16, updated as w16 minus 0.1 times d16. In code without vectorization, you could use a for loop like this: for j in range(0, 16), which again goes from 0 to 15, set w[j] equal to w[j] minus 0.1 times d[j].

In contrast, with vectorization, you can imagine the computer’s parallel processing hardware like this. It takes all 16 values in the vector w, subtracts in parallel 0.1 times all 16 values in the vector d, and assigns all 16 results back to w, all at the same time and all in one step. In code, you can implement this as follows: w is assigned w minus 0.1 times d. Behind the scenes, the computer takes these NumPy arrays w and d and uses parallel processing hardware to carry out all 16 computations efficiently. So using a vectorized implementation, you should get a much more efficient implementation of linear regression.
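
The two versions of the parameter update described above might look like this in code. It is a sketch under stated assumptions: the video only says there are 16 weights and 16 derivative terms, so the numbers in `w` and `d` are invented, and the names `alpha`, `w_loop` and `w_vec` are hypothetical.

```python
import numpy as np

# Hypothetical values for the 16 weights and their 16 derivative terms.
w = np.linspace(0.5, 2.0, 16)
d = np.linspace(-1.0, 1.0, 16)
alpha = 0.1  # learning rate, 0.1 as in the video

# Without vectorization: update one parameter per loop iteration.
w_loop = w.copy()
for j in range(0, 16):
    w_loop[j] = w_loop[j] - alpha * d[j]

# With vectorization: update all 16 parameters in one array expression.
w_vec = w - alpha * d

# Check that both approaches give the same updated weights.
print(np.allclose(w_loop, w_vec))
```

The loop works on a copy so that the original `w` is still available for the vectorized version; in real gradient descent you would simply write `w = w - alpha * d`.
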
Maybe the speed difference won’t be huge if you have 16 features, but if you have thousands of features and perhaps very large training sets, this type of vectorized implementation will make a huge difference in the running time of your learning algorithm. It could be the difference between code finishing in one or two minutes versus taking many, many hours to do the same thing.

In the optional lab that follows this video, you’ll see an introduction to one of the most used Python libraries in machine learning, which we’ve already touched on in this video, called NumPy. You’ll see how to create vectors in code, and these vectors, or lists of numbers, are called NumPy arrays. You’ll also see how to take the dot product of two vectors using a NumPy function called dot. And you’ll also get to see how vectorized code, such as using the dot function, can run much faster than a for loop. In fact, you’ll get to time this code yourself and hopefully see it run much faster. This optional lab introduces a fair amount of new NumPy syntax. So when you read through the optional lab, please don’t feel like you have to understand all the code right away, but you can save this notebook and use it as a reference to look at when you’re working with data stored in NumPy arrays.

So congrats on finishing this video on vectorization. You’ve learned one of the most important and useful techniques in implementing machine learning algorithms. In the next video, we’ll put the math of multiple linear regression together with vectorization, so that you will implement gradient descent for multiple linear regression with vectorization. Let’s go on to the next video.
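
If you want to try the timing comparison before opening the lab, a rough sketch could look like the one below. This is not the lab's code: the vector length, the random seed, and the helper name `loop_dot` are assumptions chosen for illustration, and the measured times will vary from machine to machine.

```python
import time

import numpy as np


def loop_dot(a, b):
    """Dot product computed one element at a time (hypothetical helper)."""
    total = 0.0
    for j in range(a.shape[0]):
        total = total + a[j] * b[j]
    return total


# Assumed size and seed; larger vectors make the gap easier to see.
rng = np.random.default_rng(1)
a = rng.random(200_000)
b = rng.random(200_000)

start = time.perf_counter()
loop_result = loop_dot(a, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vec_result = np.dot(a, b)
vec_seconds = time.perf_counter() - start

print(f"for loop: {loop_seconds:.6f} s")
print(f"np.dot:   {vec_seconds:.6f} s")
```

On most machines the `np.dot` call should finish far sooner than the loop, though the exact ratio depends on the hardware and the NumPy build.
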