11-785 Spring 2023 Recitation 0B: Fundamentals of NumPy (Part 1/8)
Hi everyone, my name is Percurthy and I’m one of the TAs this spring for 11-785, Introduction to Deep Learning. I will be going through the fundamentals of NumPy in this recitation. This recitation is going to cover what NumPy is, installing NumPy, initializing data, accessing and modifying data using NumPy, pivoting data, combining data, and math operations.
So firstly, what is NumPy? NumPy is a package for scientific computing in Python: a library that provides a variety of operations for quick computations on arrays. These operations range from basic logical operations to linear algebra and random simulations, among other things. We will be using a lot of NumPy in this course, so it’s important for all of you to have a good grasp of it.
Now we’re going to talk about installing NumPy. Generally NumPy is pre-installed on Colab and AWS, so it’s important to first check whether NumPy is available and which version is available. The version can be checked using the command pip show numpy. If you don’t have NumPy installed on your system, you can install it using the command pip install numpy. Once you’ve installed NumPy, it can be imported using the statement import numpy as np, where np is the commonly used alias for NumPy.
Moving on to initialization. We’re going to first talk about intrinsic NumPy array creation functions, and more specifically 1D array creation functions. Firstly, we have the np.arange function. This function returns evenly spaced values within a given interval. If I specify the input parameter, which in this case is 10, the output is an array of 10 values from 0 to 9. We also have the np.linspace function. This function returns an array of evenly spaced numbers over a specified interval. In this case, we have our starting value to be 2 and our ending value to be 3, and we also want 5 elements in our array.
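As a brief aside before the linspace example continues, the import and the arange call described so far can be sketched as follows. This is a minimal sketch; the variable name arr is only illustrative and is not taken from the recitation notebook.

```python
import numpy as np

# The installed version, the same information that `pip show numpy` reports
print(np.__version__)

# arange(10): ten evenly spaced integers starting at 0 and stopping before 10
arr = np.arange(10)
print(arr)  # [0 1 2 3 4 5 6 7 8 9]
```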
We also have another parameter called endpoint, which lets us specify whether we want the endpoint to be included in our output array or not. In this case, since we’ve set endpoint to False, our output array has values from 2 to 2.8, excluding 3, and has 5 elements in total.
We now move on to some general ND array creation functions. We start with numpy.empty, which is used to create an empty array, that is, an array whose entries are left uninitialized. We can specify the number of rows and columns we want by passing the desired shape as the input parameter. In this case, we have created a matrix with 2 rows and 2 columns. We also have the numpy.zeros function, which creates an array of zeros of a given shape, and the numpy.ones function, which creates an array of ones of a given shape. In our case, the output zeros array is an array of all 0s with 2 rows and 3 columns, and the output ones array is an array of all 1s with 4 rows and 2 columns.
We now move on to the numpy.zeros_like function. This function is used to create an array of zeros without specifying the size explicitly. Instead of specifying the size, we pass another array as a parameter, and NumPy automatically takes the dimensions from this passed array. In this case, the ones_arr that we pass in has shape (4, 2), and therefore the output array zeros_like_arr is going to have the same shape (4, 2), with 4 rows and 2 columns, and it’s going to have all 0s. Similarly, the numpy.ones_like function returns an array of ones with the same shape and type as the input parameter. Since our zeros_arr had dimensions (2, 3), we see that our output ones_like_arr has all 1s with 2 rows and 3 columns.
We also have the numpy.full function, which is used to create an array of a given shape and fill that array with a particular value.
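Before the full example continues, the creation functions covered up to this point can be sketched roughly as follows. The variable names (lin, empty_arr, and so on) are assumptions chosen to match the names spoken in the recitation, not confirmed identifiers from the notebook.

```python
import numpy as np

# 5 evenly spaced values starting at 2; endpoint=False leaves out the stop value 3
lin = np.linspace(2, 3, num=5, endpoint=False)
print(lin)  # values 2.0, 2.2, 2.4, 2.6, 2.8

# 2x2 array whose contents are uninitialized (whatever happened to be in memory)
empty_arr = np.empty((2, 2))

zeros_arr = np.zeros((2, 3))  # 2 rows, 3 columns, all 0
ones_arr = np.ones((4, 2))    # 4 rows, 2 columns, all 1

# The *_like functions take their shape and dtype from the array passed in
zeros_like_arr = np.zeros_like(ones_arr)  # shape (4, 2), all 0
ones_like_arr = np.ones_like(zeros_arr)   # shape (2, 3), all 1
```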
In this case, we have specified that we want the output shape to be 2 rows and 2 columns and our fill value to be 10. Therefore, this is the output that we get. Similarly, the numpy.full_like function takes similar parameters, except that instead of a shape we pass in an existing array and specify the new value that we want an array of that shape to be filled with. Our zeros_arr initially had all 0s. Since we specified our fill value to be 0.1, the output array has the same dimensions as the original zeros_arr, 2 rows and 3 columns, but it is now filled with 0.1. We can also specify the type that we want our output array to have using the dtype parameter, and in this case we want it to be filled with doubles.
We can also create an array from existing data, and more specifically convert from other Python structures like lists and tuples. Suppose we have a list containing 4 elements and we want to convert this list to a NumPy array. We can achieve this using the np.array function: passing the list as an input parameter converts it into a NumPy array, which is stored in the variable arr_from_list.
NumPy can also be used to load data from disk, and we can do this using two different functions. We’re first going to look at the np.loadtxt function, which loads data from a text file whose path we specify as an input parameter. We also have the np.load function, which initializes an array by loading data from a NumPy (.npy) file, and we can specify the path to that file as an input parameter as well.
We now move on to the use of special library functions like random. We can create an array with random values using the numpy.random.randint function, which takes a start value, an end value, and a size parameter. Therefore, our output array is going to have shape (1, 4) and it’s going to be filled with random integers from 0 up to, but not including, 10.
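The full, full_like, list conversion, and loading steps described above can be sketched as follows. The list contents, the file names data.txt and data.npy, and the temporary directory are all hypothetical; the recitation does not say what its files contain, so this sketch writes its own files first and then loads them back.

```python
import os
import tempfile

import numpy as np

full_arr = np.full((2, 2), 10)  # 2x2 array filled with 10

zeros_arr = np.zeros((2, 3))
# Same shape as zeros_arr, filled with 0.1, stored as doubles (float64)
full_like_arr = np.full_like(zeros_arr, 0.1, dtype=np.double)

# Convert a Python list (illustrative values) into a NumPy array
arr_from_list = np.array([1, 2, 3, 4])

# Write the array to disk in both formats, then load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    txt_path = os.path.join(tmp_dir, "data.txt")  # hypothetical file name
    npy_path = os.path.join(tmp_dir, "data.npy")  # hypothetical file name
    np.savetxt(txt_path, arr_from_list)
    np.save(npy_path, arr_from_list)

    from_txt = np.loadtxt(txt_path)  # text loader returns floats by default
    from_npy = np.load(npy_path)     # .npy keeps the original dtype
```

Note that full_like returns a new array and leaves zeros_arr itself unchanged.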
One thing to note here is that every time we run this random function, we end up getting a different output. For debugging purposes, that makes it difficult to understand how or where our program isn’t working properly. Therefore, it is advisable to set a seed: the seed lets NumPy generate a sequence of random numbers that stays the same across runs. So if I set the seed to 0 here and run this code two times, or any number of times, I will get the same output array.
The numpy.random.rand function is also used to draw samples, in this case from a uniform distribution over 0 to 1. The 3 and 2 over here specify the dimensions of the output array, which indicate that we want an output array with three rows and two columns, as seen over here.
We can also utilize the np.random.randn function to sample values from a Gaussian distribution. randn draws from the standard normal distribution, so we simply need to specify the mean mu and the standard deviation sigma and use them to shift and scale those samples. In this case, we multiply the array of standard normal samples by the standard deviation sigma and add the mean mu. Since we want our output Gaussian matrix to be 2 by 4, we specify 2 and 4 as the input parameters to np.random.randn. Therefore, once we multiply this by sigma and add mu, our output array follows a Gaussian distribution with mean mu and standard deviation sigma.
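The seeding and sampling steps above can be sketched as follows. This sketch assumes that the function taking a start value, an end value, and a size is np.random.randint, and that the Gaussian samples come from np.random.randn; the values of mu and sigma are illustrative, since the recitation does not state them.

```python
import numpy as np

# Seeding makes the draws reproducible: the same seed gives the same numbers
np.random.seed(0)
first = np.random.randint(0, 10, size=(1, 4))   # integers in [0, 10)
np.random.seed(0)
second = np.random.randint(0, 10, size=(1, 4))  # identical to `first`

# Uniform samples over [0, 1) with 3 rows and 2 columns
uniform = np.random.rand(3, 2)

# Gaussian samples: scale standard normal draws by sigma, then shift by mu
mu, sigma = 3.0, 2.0  # assumed values, for illustration only
gaussian = mu + sigma * np.random.randn(2, 4)
```

Resetting the seed before each call is only there to show the reproducibility; in practice the seed is usually set once at the top of a script.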