11-785 Spring 2023 Recitation 0G: Debugging and Visualisation (Part 1/3)
Hi everyone, my name is Qing. Welcome to Recitation 0G, Debugging of Deep Neural Networks. In this recitation, we'll first look at some common scenarios you'll run into when writing deep neural networks, and then go through some general coding tips. After that, we'll cover three types of common issues in deep learning code: coding errors, time issues, and memory issues. Hopefully, by the end of this presentation, you will be able to identify the type of error you're facing while debugging and know what you should try in order to find the bugs in your code.

First, let's look at some common frustrating scenarios that everyone will probably face at some point: your code throws an error and stops running, and you don't understand the lengthy error message; or your code runs, but the accuracy is terribly low and not improving; or your model is taking forever to train and you can't finish training before the deadline. Don't worry: this recitation will tell you exactly what to do in each of these scenarios.

First, let's go through some good coding practices that every machine learning engineer should know. Tip number one: consolidate all the hyperparameters in a config dictionary. Think of training a deep learning model as a science experiment, with all the hyperparameters as your independent variables. In a science experiment, we first draw a table to record all the results before actually starting the experiment. In a deep learning experiment, we do the same: by putting everything together in a config dictionary, we can easily change and track the hyperparameter combinations.

Tip number two: just as every coding class will teach you, debug by writing small test cases. Deep learning code can be very long, and there might be hundreds of lines of code in each homework. You should break things down into small functions and test each function individually.
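A minimal sketch of such a config dictionary; the hyperparameter names and values here are illustrative, not the homework's actual settings:

```python
# All hyperparameters live in one place, so each experiment run
# is fully described by this single dictionary.
config = {
    "batch_size": 64,
    "lr": 1e-3,
    "epochs": 10,
    "hidden_size": 256,
    "optimizer": "adam",
}

def run_experiment(config):
    # Everything downstream reads from the config, never from
    # hard-coded constants scattered through the code.
    print(f"Training for {config['epochs']} epochs "
          f"with lr={config['lr']}, batch_size={config['batch_size']}")

run_experiment(config)
```

To try a new hyperparameter combination, you only edit the dictionary, and the whole run is reproducible from that one object.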
When writing a sanity check for your function, the first thing you should do is print the type and shape of important variables. Unlike debugging LeetCode problems, where you can print the whole variable, the variables in deep learning models are often high-dimensional arrays that are too big for human inspection. So we first check whether they have the right shape, and to inspect whether the data in the array makes sense, we print only a small segment of it. Furthermore, sometimes even a small segment of data doesn't make sense if we are just staring at the raw numbers. For example, if the data should be a 100-by-100 matrix representing a black-and-white image, we wouldn't know if it's the right image just by staring at the numbers. Hence, we'll teach you how to use matplotlib to visualize this kind of data later in this recitation. Also, if there's a loop in your function, try to break and print after one iteration, before things get too complicated.

Tip number three: use a debug flag. A debug flag is a switch between debugging mode and training mode. Imagine that during debugging you wrote 50 print statements everywhere to monitor variables. Without a debug flag, you would have to manually delete all of them before training; otherwise, training would be extremely slow. And later, if you wanted to make a slight change to your code, you might need to rewrite all the print statements again for debugging. To avoid this trouble, we can define a boolean debug flag as shown here: when you're not debugging, set it to False, and when you are debugging, set it to True, and it will print all the necessary information for you.

Tip number four: if you are running code in a Jupyter notebook or Google Colab, since you can run individual cells in whatever order you want, sometimes you accidentally run a cell twice or run the cells in the wrong order.
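A minimal sketch of the boolean debug flag idea; the helper and function names here are made up for illustration:

```python
DEBUG = True  # flip to False for real training runs

def dprint(*args):
    # Print only when the debug flag is on, so the training loop
    # stays fast once debugging is done.
    if DEBUG:
        print(*args)

def train_step(batch):
    dprint("batch type:", type(batch), "len:", len(batch))
    # ... the actual forward/backward pass would go here ...
    return sum(batch) / len(batch)  # dummy "loss" for the sketch

loss = train_step([1.0, 2.0, 3.0])
dprint("loss:", loss)
```

With this pattern, switching between debugging mode and training mode is a one-character change rather than deleting and rewriting dozens of print statements.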
This type of error is hard to notice because, technically, there's nothing wrong with your code. So one thing you should try every time you face an issue is to restart the kernel and rerun everything in order from the beginning.

Now, in this notebook, we'll demonstrate how to write a sanity check and how to visualize an image dataset and a speech dataset. As we just mentioned, a data sanity check involves printing the type of the data, the shape of the data, and a few instances of the data.

First, let's go through the example of visualizing image data. Here we use the built-in FashionMNIST dataset that comes with torchvision, and we can load it using the default function. FashionMNIST is a dataset of images consisting of 60,000 training examples and 10,000 test examples. Each example comprises a 28-by-28 grayscale image and an associated label from one of 10 classes. As you can infer from the name, the images are types of clothing. Now we want to write a sanity check to see whether the data loader is working properly. The first thing we can check is whether the samples are 28-by-28 grayscale images, and the second is whether the labels match the images. Here, we can see that the shape of the loaded data is 1 by 28 by 28, which makes sense. Also, the value of the loaded label is 9. This also makes sense because there are 10 classes in total, each label is one of the numbers from 0 to 9, and 9 is in that range. So for now, our sanity check passes. However, we still don't know whether the label matches the image correctly. If we were to just print the image data, it's a 1 by 28 by 28 array, so you'd see this, and it's very clear that you can't tell what kind of clothing this is. So what we want to do is use matplotlib, which is a very useful Python library for plotting graphs. Here, by running plt.imshow, we're able to plot this tensor as an image.
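A sketch of such a data sanity check. To keep the snippet self-contained, we fake one (image, label) pair with NumPy instead of downloading FashionMNIST, but the checks are the same ones you would run on the real loader's output:

```python
import numpy as np

# Stand-in for one sample from the FashionMNIST loader:
# a 1 x 28 x 28 grayscale image and an integer class label.
image = np.random.rand(1, 28, 28).astype(np.float32)
label = 9

# Check 1: type and shape of the loaded sample.
print(type(image), image.shape)   # expect shape (1, 28, 28)
assert image.shape == (1, 28, 28)

# Check 2: the label is one of the 10 classes (0 through 9).
assert 0 <= label <= 9

# Check 3: print only a small segment, not the whole array.
print(image[0, :3, :3])
```

Shape and range checks like these catch most loader bugs; whether the label actually matches the image still needs visualization, which is what plt.imshow is for.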
And you can see it here: it's a boot. Also, to check whether the image matches the label, you need a labels map, which often comes with the dataset. This labels map translates the label number into a type of clothing that makes sense to us. Here, you'll see that 9 corresponds to "Ankle Boot", so this is a correct implementation of the dataset. Here we leave you with an exercise: try to use matplotlib to visualize the 10th image in the dataset, along with its label. What kind of clothing is it?

Now for the second example, we'll be visualizing speech features. In this class, other than image recognition tasks, we will also be dealing with speech recognition a lot. MFCCs are commonly used as features in speech recognition systems, such as a system that can automatically recognize numbers spoken into a telephone. Here we demonstrate how to load an audio file and visualize its MFCC. First of all, if you are running this notebook in Colab, you should upload the audio files from here; I have already imported the files, the English-speech file and the "buns mono" file. To play the audio in the notebook environment, you can import Audio from IPython.display, and then play and listen to the content of the audio first: "Hello from the children of planet Earth." Now we want to convert this audio file into a torch tensor and use that for feature extraction; in the actual machine learning experiment, you'll use the extracted features for training. The first step is to transform this audio into a torch tensor object. You can use torchaudio.load to do that. You'll get a torch tensor of size 1 by 34,000, and you'll also get the sampling rate, which is 8,000. If you look at the sampling rate and the length of the audio, it makes sense that the audio tensor has about 34,000 entries, because four seconds times 8,000 samples per second is around 32,000 data points.
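The sample-count arithmetic above, written out as a check you can run against any loaded audio tensor (the numbers are the ones from this example; the real values come from `torchaudio.load`, e.g. `waveform, sample_rate = torchaudio.load("audio.wav")`):

```python
sample_rate = 8000   # samples per second, as reported by torchaudio.load
duration_s = 4       # approximate clip length in seconds
observed_samples = 34000  # second dimension of the loaded tensor

# Expected number of samples = sampling rate x duration.
expected_samples = sample_rate * duration_s
print(expected_samples)  # 32000

# The clip is "about" 4 seconds, so we only check rough agreement.
assert abs(observed_samples - expected_samples) < 0.2 * expected_samples
```

This kind of back-of-the-envelope check is a quick way to confirm that the file was loaded at the sampling rate you think it was.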
Then we'll extract the MFCC features from this data. You can use the built-in function from torchaudio.transforms: you specify the number of features and the sampling rate, and you get the features. If we print the features, you'll see that they are in the shape 1 by 15 by 171, where the 15 corresponds to the number of features you specified. So how do we visualize the MFCC coefficients? Still using matplotlib, we can specify the title of the image and the x and y labels of the axes, and lastly use imshow to plot the data. Let's see. You'll see this colored graph, which is darker at the bottom and lighter at the top. You can look up the math behind MFCCs to figure out why the color is darker at the bottom and lighter at the top; in general, that's usually the case. So if you want to check whether your MFCC is loaded correctly, you can plot the data and see whether the colors match this image.

In this section, we also leave you an example to try for yourself. Upload the "buns mono" audio file to the Colab environment and, following the same steps we went through above, load the audio file, obtain its MFCC coefficients, and plot them against the time frame. Then try to compare the two MFCC plots and spot some similarities and differences between them.

Lastly, we also attached a matplotlib cheat sheet for beginners at the end of this notebook. Please download this notebook and try the examples above. You can find more advanced features by going to this link for other matplotlib cheat sheets.
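A sketch of the plotting step. Here we plot a synthetic 15-by-171 array in place of real MFCC output so the snippet runs without an audio file; with real data you would pass the tensor produced by `torchaudio.transforms.MFCC` instead:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for MFCC output of shape (1, 15, 171):
# 15 coefficients over 171 time frames.
features = np.random.randn(1, 15, 171)

fig, ax = plt.subplots()
ax.set_title("MFCC")
ax.set_xlabel("Time frame")
ax.set_ylabel("Coefficient")
# origin="lower" puts coefficient 0 at the bottom of the plot.
im = ax.imshow(features[0], origin="lower", aspect="auto")
fig.savefig("mfcc.png")
print(im.get_array().shape)  # (15, 171)
```

Setting the title and axis labels before calling imshow matches the order described above; `aspect="auto"` stretches the 15 coefficient rows so the plot isn't a thin strip.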