11-785 Spring 2023 Recitation 0G: Debugging and Visualisation (Part 2/3)
Hi everyone, this is part 2 of Recitation 0G, debugging deep neural networks. In this video, we'll go through three basic types of coding errors: syntax errors, logic errors, and runtime errors. After following the tips in this section, you should be able to get your code running smoothly.
First, the most common errors beginners face are syntax errors. All code in this course will be written in Python, so please go through Homework 0 and Recitation 0A to warm up on Python syntax. Stack Overflow is also always your best friend: you can often find a solution to a syntax error just by searching for the error message there.
Second, logic and math errors are common in this course as well, since we use a lot of matrix manipulation and math. To tackle this kind of problem, first read the homework write-up very carefully and try to understand what each dimension of a matrix represents. Based on that, check whether the shapes of your variables make sense. You'll also use a lot of NumPy library functions for matrix manipulation, and sometimes you'll get errors from calling the wrong function or passing the wrong parameters. So please read the NumPy documentation for a function carefully before calling it.
Lastly, you'll see runtime errors when debugging, which are probably the biggest headache of the three types. For example, suppose you have a function that could lead to a division-by-zero error, but only when the input takes some specific values. Your code then appears to run normally for a few iterations until it stops, seemingly at random, and throws an error. What you can try in this case is to read the traceback to find the root error and read the library documentation for the specifics of the function. Also, if you are training on CUDA, the error messages are usually too vague.
So you can try setting the batch size to 1 and running the code on CPU, which often generates more helpful error messages. There are also great debugging tools you can use for Python code. pdb is an interactive Python debugger, and now we have a code demo to show you how to use it and why it is useful.
In this notebook, we are going to guide you through how to read a traceback error message and how to use pdb for interactive Python debugging. First of all, let's go through how to read tracebacks. Most of the time, when a Python script fails, it raises an exception. When the interpreter hits one of these exceptions, information about the cause of the error can be found in the traceback, which can be accessed from within Python. If we run the following code, you'll see the Python exception here. If we just look at this single line, it's not very obvious why the error is occurring. But if you read the traceback carefully, first you'll know that it's a division-by-zero error, and you can also expand the stack to see exactly which lines could have caused it. If you go back one frame, you'll see that it's called in function one, and in function one this specific line, a divided by b, caused the issue.
Colab has one very useful feature that lets you jump right to the point where the error occurred. If you click the blue link here, a cell opens at the side with your cursor at the line that might have caused the error. It might not seem very useful in this example, but in real life, when you're doing your homework, your code could have hundreds of lines, and it's very hard to scroll up and down to locate the error. So it's very useful to be able to just open a cell at the side and modify the code.
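For anyone following along without the notebook, here is a minimal sketch of the kind of code that produces the division-by-zero traceback described above. The names func1, func2, a, b, and x are assumptions chosen to match what is said in the video, not the notebook's actual code, and the standard-library traceback module is used only to list the frames in the order you would read them.

```python
import traceback


def func1(a, b):
    # Innermost frame: the division that actually fails when b == 0.
    return a / b


def func2(x):
    # Hypothetical caller; for x == 1 it passes b == 0 down to func1.
    a = x
    b = x - 1
    return func1(a, b)


def frame_names(exc):
    """Function names along the traceback, outermost call first."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


try:
    func2(1)
except ZeroDivisionError as exc:
    error_type = type(exc).__name__
    call_chain = frame_names(exc)
    # The last entries are the frames closest to the failure.
    print(error_type, "raised in", " -> ".join(call_chain))
```

When reading a real traceback, start from the bottom: the last frame (func1 here) is where the exception was raised, and the frame before it (func2) shows which call supplied the bad value.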
Another very useful feature of Google Colab is that you can just click the Search Stack Overflow button here. It will take you to a Google results page, and you can usually find a solution on the Stack Overflow pages.
Okay, now let's go through an example that raises an error more similar to the ones we will see in real deep learning code. For now we don't have to understand what each line of code is doing; we are mainly looking at the error message generated. As you can see here, first we can expand the traceback, and you'll see a very, very long error message. Some of it looks very complicated, and you may not quite understand it. However, you shouldn't be too afraid of it. Although it looks complicated, most of the frames are tracing into the imported Python libraries, in this case PyTorch. We can assume that no bug was introduced by widely used libraries like PyTorch and NumPy. So when debugging, you can just read the frames up to right before the ones for library functions. Here we can see that this line and this line are potential causes of the error.
Okay. Even after reading the traceback, it can still be hard to understand why some errors are occurring. So here we introduce you to the interactive Python debugging tool called pdb. This debugger lets you step through the code line by line to see what might be causing a more complicated error. Perhaps the most useful and convenient interface for debugging is the %debug magic command. If we run the line that caused the Python exception and then run the %debug magic, you can see that this prompt gets generated automatically, and you can interact with the code from here. For example, let's do something simple. We want to figure out exactly what the values of a and b are at this point. So we can type p a, where p stands for print and a is the variable name.
It tells you that the value of a at this step is one. Similarly, we can print the value of b at this point. You'll see that the value of b is zero, and now it's very clear why it is causing the error. To quit this debug interface, we type q for quit.
Furthermore, this interactive debugger allows more than printing the values of variables. You can also step up and down the traceback frames to see what's going on. If we go up, you jump to the corresponding place in the previous frame, and you can print some other variables, for example x. You can go up one more frame, you can also go down frames, and you can specify the number of frames to go up or down by typing a number here. So now we are back at a divided by b.
If you want the debugger to launch automatically whenever an exception is raised, you can use the %pdb magic function to do this job for you. You type %pdb on and then run the code that generates the error. It will stop exactly at the point where the error is generated, and here you can similarly print a and print b to inspect the values.
Finally, if you want to monitor the behavior of your code step by step in interactive mode, you can import pdb and use this line of code, pdb.set_trace(). If you have heard of breakpoints, this line of code basically adds a breakpoint to your code, and every time your code runs into a breakpoint, it will stop and show the interactive prompt. Here it stops, and you can type next to go to the next step, or you can type c to continue until the next breakpoint.
At the end of the notebook, I have attached a few useful commands that you can use with pdb. Feel free to explore all these options; there is more to learn about pdb in pdb's online documentation here. Thank you.
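The commands from the demo can also be tried outside the notebook. The sketch below is a stand-in for the demo, not the notebook's code: func1 and func2 are hypothetical functions, and the commands p a, p b, up, p x, and q are fed to pdb from a string so the session runs without anyone typing. In normal use you would call pdb.post_mortem() or run %debug and type the same commands at the (Pdb) prompt yourself.

```python
import io
import pdb
import sys


def func1(a, b):
    return a / b  # fails when b == 0


def func2(x):
    a = x
    b = x - 1
    return func1(a, b)


def scripted_post_mortem(tb, commands):
    """Run a pdb post-mortem session on traceback `tb` with scripted commands."""
    fake_input = io.StringIO("\n".join(commands) + "\n")
    captured = io.StringIO()
    # Supplying stdout makes pdb read commands from `fake_input`
    # instead of waiting for keyboard input.
    debugger = pdb.Pdb(stdin=fake_input, stdout=captured, readrc=False)
    debugger.reset()
    # Intended to mirror what pdb.post_mortem(tb) does, minus the typing.
    debugger.interaction(None, tb)
    return captured.getvalue()


try:
    func2(1)
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    transcript = scripted_post_mortem(tb, ["p a", "p b", "up", "p x", "q"])
    print(transcript)
```

In the captured session, the values printed after the (Pdb) prompts are a and b inside func1, followed by x after moving up one frame into func2.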