Lab 3: Debugging with GDB

Table of contents

  1. Getting Started
  2. SSH Keys for ieng6
    1. Generating SSH Keys
    2. Copying the Public Key to the Server
    3. Managing SSH Keys
  3. Some More Vim!
    1. Moving Up and Down
    2. Custom Configuration
      1. Create your .vimrc
      2. Viewing Hidden Files
  4. What and Why Debugging?
  5. Debugging Activity
    1. Compile for GDB
    2. Running in GDB
    3. Breakpoints
    4. TUI in GDB
    5. Basic GDB commands
    6. Segfault and Backtrace
  6. PA Preparation
  7. Optional Feedback Form

Getting Started

What is your favorite movie, or (if you prefer) the last movie you saw? What genre is it?

Discuss with your partner:

  1. Who is your favorite character and why?
  2. What is your favorite thing about the movie?

Throughout this lab, we strongly encourage you to help each other. The staff is always there to help, but do try working together and helping each other out first.

SSH Keys for ieng6

We’ve already set up SSH keys for GitHub access in Lab 2. This time, we’ll set up SSH keys for access to ieng6. This means that we’ll need to generate the keys locally on your machine, then copy the public key to ieng6. Once set up, you won’t need to enter your password every time you log in or copy files because the ssh command can use your key files to grant access in place of your password.

Generating SSH Keys

On your local machine (NOT on ieng6), run the ssh-keygen program:

$ ssh-keygen -t rsa

First, you will be prompted to type a path location on your machine to save the private key that will be generated. Press Enter to save the file at the default path location, which will be somewhere like /Users/<user>/.ssh/id_rsa (this might not be the same for your device, but that’s fine, just make a note of what it actually is), where <user> will be your username.

Enter file in which to save the key (/Users/<user>/.ssh/id_rsa):

When prompted to enter a passphrase, just press Enter to continue without setting one.

This will generate the two key files on your computer, in the .ssh directory, within your home directory. The private key will be saved in a file named id_rsa and the public key will be saved in a file named id_rsa.pub.

You can verify that these files were created using the ls command:

$ ls ~/.ssh

Copying the Public Key to the Server

Next, log onto your ieng6 account and check that the ~/.ssh directory exists. If not, create this directory with mkdir. Replace username in the command below with your ieng6 account username.

$ ssh username@ieng6.ucsd.edu
$ cd ~
$ mkdir .ssh

After creating the .ssh directory inside your home directory, you can log off ieng6.

Now that we have already generated both of our key files, we need to copy the public key file to our remote machine or server (ieng6) that we wish to connect to with ssh.

Make sure you copy the public key file (id_rsa.pub), which ends in a .pub extension, to the server, not the private key (id_rsa).

The scp command below copies your public key, named id_rsa.pub, into the file named authorized_keys within the .ssh directory that you just created on the ieng6 server. Note that this overwrites any authorized_keys file that already exists there.

Run the following command on your local computer, replacing <user> and <username> with your own usernames for your personal computer and the ieng6 server respectively. The path to id_rsa.pub might be different for you, depending on what the ssh-keygen program printed.

$ scp /Users/<user>/.ssh/id_rsa.pub <username>@ieng6.ucsd.edu:~/.ssh/authorized_keys

Try logging into your account again. You shouldn’t be prompted to enter a password this time!

$ ssh username@ieng6.ucsd.edu

Managing SSH Keys

In this class, we will mostly be using the ieng6 server. However, in the future you might have access to multiple servers for which you want to create separate SSH keys. To help us keep track of our SSH keys, we can create a configuration file called an SSH config file.

Exit the ieng6 server one more time, and create this SSH config file inside your .ssh directory that is in your home directory on your computer.

$ vim ~/.ssh/config

Copy the following lines and paste them into the config you just created and opened.

Host ieng6
    HostName ieng6.ucsd.edu
    User <username>
    IdentityFile ~/.ssh/id_rsa

Replace <username> with your ieng6 username. The IdentityFile field specifies the filename for the private key created to access the given server.

Now try to log in to your ieng6 account again. This time, you can type the ssh command in a shorthand form using the name provided in the ssh config file after Host, which is ieng6 in this case.

$ ssh ieng6

This configuration file can contain many entries to manage access to multiple sets of ssh keys for various servers that you might have access to in the future. And it will save you some typing also. How nice!

Some More Vim!

Let’s learn about just a few more Vim commands that might be helpful for the labs and programming assignments.

Moving Up and Down

You may have noticed that moving up and down line by line in a Vim file can be relatively slow. Scrolling up and down the file would be faster!

To scroll down in the file, hold the Ctrl key and press d.

To scroll up in the file, hold the Ctrl key and press u.

Custom Configuration

So far, we have seen that Vim commands can be manually typed in an open Vim window. However, this can quickly become repetitive if you want to execute the same commands every time you open a new Vim window.

Alternatively, we can tell Vim to execute a certain set of commands automatically every time we open a Vim window in the future. For example, we can type a command to enable syntax highlighting in Vim, which will add syntax-specific coloring to your code and will vastly improve your experience reading through code in Vim. Another Vim nicety we might want to enable is adding line numbers to the file. There are many, many different commands and settings that we can enable and configure to create our own personal Vim experience, which will make writing code in Vim much more enjoyable!

We will do this by putting these commands directly into a Vim runtime configuration file, called a .vimrc file.

The . prefix means that it is a hidden file in UNIX, also commonly called a dotfile. This means that these files will not display, by default, when listing the contents of a directory (using the ls command).

By convention, you will create the .vimrc file in your home directory: ~/.vimrc

Remember: the tilde character ~ is a shorthand for the current user’s (your) home directory in UNIX.

Create your .vimrc

Using Vim, create an empty .vimrc file in your home directory on your ieng6 account:

$ vim ~/.vimrc

Then copy the following lines into your empty .vimrc file, as shown in the code snippet below:

syntax on
set number
set belloff=all
set showtabline=2   " always show the tab line
set tabstop=4
set expandtab   " insert spaces instead of tab characters
set autoindent
set smartindent
inoremap { {<CR>}<Esc>ko
set backspace=indent,eol,start

The effect of each of these commands is summarized briefly below:

  • syntax on: enable syntax highlighting
  • set number: display line numbers in Vim
  • set belloff=all: disable Vim bell sounds
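  • set showtabline=2: always show the tab line at the top of the window
  • set tabstop=4: display tab characters as four columns wide
  • set expandtab: insert spaces instead of a tab character when you press Tab
  • set autoindent: apply indentation to next line based on current line
  • set smartindent: apply indentation with respect to code syntax
  • inoremap { {<CR>}<Esc>ko: autocompletion of curly braces for ease of use
  • set backspace=indent,eol,start: let Backspace delete indentation, line breaks, and text that was typed before entering insert mode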

In general, you should not paste a command into your .vimrc if you are not sure what it is doing. But you can definitely trust us! :D

These are our suggestions for .vimrc settings that we think would be helpful for you in this class. If you feel like you don’t like some of these features, feel free to remove the corresponding lines in the .vimrc file. These configuration files are usually customized to each programmer’s preferences—figure out what works well for you!

Viewing Hidden Files

You can now save and exit your .vimrc file. Check to make sure that you have successfully created this file in your home directory, using the ls command.

$ ls ~

Do you see your newly created .vimrc file? If not, why do you think that is, and how would we be able to view any hidden files in a directory?

Hint

Check the manual page for the ls command by typing: man ls into the terminal.

The man page, or manual page, is a software documentation resource built into the Unix terminal, where you can find descriptions and usage information about various Unix commands, such as ls, cd, etc. This can be a very helpful resource if you are ever unsure about how to use a certain Unix command, since it is directly accessible from within your terminal.

Once you have determined the correct usage of the ls command, including the ability to also view files starting with ., run the command in your home directory. Do you see your newly created .vimrc file now?

There are many commands you can add to your .vimrc over time, to create your own personalized setup. With enough setup and customization, Vim can provide many of the same features as modern IDEs (Integrated Development Environments), like VS Code for example.

What and Why Debugging?

Debugging is the practice of finding and fixing errors in programs. There are no real bugs in your program, although this has actually happened at least once before.

Like programming, debugging is also an incremental process. Programmers often need to isolate and debug one error at a time, rather than all errors at once.

And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs.
And sometimes there are bugs within bugs. But there is eventually an end!

We could insert some print statements into our code to try and figure out what’s going on. Though, unless we already know specifically which variables we want to print out, we would have to try printing out a bunch of different variables and hope some of them will reveal an error. A stubborn programmer can (theoretically) debug anything with just print statements, but this can get really messy, both in the terminal and in the code itself.

GDB is a command line program that offers a convenient way for us to pause execution at any point in the code, manually inspect the values of any variable in the scope, and walk through the program step-by-step.

GDB stands for the GNU Project Debugger. GNU, pronounced “g-noo” (or the spicier /ŋu:/ (or the more ambitious /g͡nu:/)), is a collection of free software that originally started as a project to create a free operating system.

GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which
GNU stands for GNU's Not Unix, in which GNU stands for GNU's Not Unix, in which there actually isn't an end to this one, but i can't afford to make this lab writeup infinitely long sorry i appreciate your persistence though

Debugging Activity

Accept the assignment on GitHub Classroom (https://classroom.github.com/a/Q8l79hp4) to create a repo with the starter code for this lab. Clone this repo into your cse29 folder on ieng6.

This starter code contains two programs which we will debug with GDB: reverse.c and pointer.c. We’ll start with the reverse.c program to introduce basic GDB commands, then look at pointer.c to talk about segmentation faults and how to debug them with GDB.

Compile for GDB

The correct behavior of reverse is to reverse the array {1,2,3,4,5}, then print out the reversed contents. Compile reverse.c and try running it. How is the actual output different from the expected output?

In order to use GDB, we have to compile our program with the -g flag. This tells the compiler to add some extra information to your executable file, which GDB will use. Recompile reverse.c with this flag.

$ gcc -o reverse -Wall -std=gnu99 -g reverse.c

Running in GDB

To run the program in GDB, use the gdb command and pass the program as an argument:

$ gdb ./reverse

This will spit out several lines of text, with some software and legal information that we don’t need to worry about. In the last line, you should notice that the terminal command prompt $ (as well as the other stuff before $) has been replaced by the GDB command prompt (gdb). This means that the GDB program is active and ready to receive commands.

Use the run command to run the program in GDB.

(gdb) run

This runs through the program without stopping. GDB might say something here about “Missing separate debuginfos”, but don’t worry about this. GDB is just asking to install some extra packages to get more information for debugging, but these aren’t necessary for our purposes right now.

To quit out of GDB, you can use the quit command.

(gdb) quit

Breakpoints

To pause execution at some point in the program, we have to set a breakpoint. Use the break command and give it a location in the source code that you want to pause execution at. At the moment, we’re not too sure exactly which part of the code is wrong, so we’ll pause at the beginning (the main function) and go step-by-step from there. You can use either

(gdb) break reverse.c:14

to set a breakpoint at line 14 in the file reverse.c, which happens to be where the main function starts, or more simply

(gdb) break main

to automatically set a breakpoint wherever the main function is. You can set breakpoints at any line in a source file, but usually we find it helpful to set them at the start of functions, to investigate the behavior of a function from its beginning.

You can use the info breakpoints command to list out which breakpoints have been set, as well as some information about where they are and what their associated number is. You can delete breakpoints with the command delete <number> where <number> is the number associated with the breakpoint.

After setting a breakpoint, you can run the program again and see that it pauses execution right where the breakpoint is.

TUI in GDB

Notice that GDB printed out a single line of code from the source file. This is the line of code that is about to be executed next. Here, you can use the list command to print out some of the surrounding code. This can be helpful to contextualize where the code is running.

The layout src command may also be helpful to enable a TUI (Text User Interface) for GDB that automatically renders a portion of the source code in the top half of the screen. The highlighted line in the TUI shows which line of code is to be executed next. To disable the TUI, press Ctrl + X, then A.

(gdb) layout src

The TUI won’t render any source code unless there is a program that is active, i.e. in the middle of execution.

Sometimes the TUI can get messed up if the program prints something out (which reverse does). When this happens, try using the refresh command to refresh the TUI and hope that it will restore its correct format. If something seems really messed up, try disabling and re-enabling the TUI.

On the left hand side of the TUI, you can see that the breakpoint is marked with B+>.

Basic GDB commands

After setting a breakpoint and running, execution is paused right after we enter the main function, before any other code is executed. You can verify this by using the print command to print out the contents of the array variable arr:

(gdb) print arr

and see that it contains some weird values. We see this because the next line of code, which initializes the array, has not been executed yet.

To run the next line of code, we use the next command:

(gdb) next

The highlighted line has moved onto the next line, indicating that the previous line has been executed. You can verify this by trying to print out the contents of arr again to see that it contains the correct initial values.

Run the next line of code (which calls the reverse() function) with next, then print out the contents of arr. Are these the expected values of arr after calling reverse()?

We know now that something went wrong inside the reverse() function, so we want to look into this function with GDB. Let’s restart the execution by using the run command again and responding y to the prompt asking to start the program from the beginning.

Use next to execute the program up until, but not actually executing, the call to reverse(). You’ll know that reverse() is next to be executed, but not executed yet, when the line of code is highlighted. This time, instead of using next to execute the reverse() function call, use the step command to step into the function and begin executing line-by-line from inside:

(gdb) step

GDB also provides shortcuts for some commonly used commands, which can be used in place of the full command name. Some of those include: r for run, q for quit, b for break, p for print, n for next, and s for step.

In addition to shortcuts, inputting no command and pressing Enter will automatically execute the most recently used command. This can be helpful when you need to use next repeatedly.

Then continue using next to run through the loop and figure out why reverse() isn’t working. Try printing out the relevant variables as you go through each iteration.

When you try printing out arr like you did while in main(), you’ll realize that it prints out some hexadecimal number, instead of its contents. This happens because arr is passed into reverse() as a pointer to the start of the array. The hexadecimal number you see is the address of the start of the array. You can use print *arr (arr with the dereference operator *) to dereference the pointer and get the value at the start of the array, or print *arr@NUM to print out NUM consecutive values starting at address arr. This means you can use print *arr@5 to print out all of the contents of the size 5 array.

Once you figure out why reverse() isn’t working, implement a fix for it. Push this fix to your repo.

This buggy program demonstrated an example of a logical error. A logical error is one in which the program compiles and runs without crashing, but its behavior is different from what we expect it to be.

Segfault and Backtrace

Another common type of error is a segmentation fault (or just segfault). Segmentation faults can happen when we try to access someplace in memory illegally (yes, that’s the technical term), usually an attempt to access memory from a pointer set to NULL. In pointer.c, we’ll look at an example of that.

Compile and try running pointer.c. It should immediately crash and give you a Segmentation fault (core dumped) error, telling you that some kind of illegal memory access happened. Most unhelpfully, the error message does not care to tell you why or even where it happened. Somewhat helpfully, GDB can at least answer the “where”.

Start GDB with the pointer program. This time, instead of setting a breakpoint, we can let the program run through and give us a segmentation fault. GDB automatically tells us where the segfault happened and gives us a line number. From the output, we can see that the segfault occurred in set_value(), where we attempt to dereference the pointer and set the value of its referenced memory. But main() calls set_value() three times. Which call is the one causing the error? (if you can figure out which just by reading the code, please pretend to be surprised in the next paragraph)

Here, we can use the backtrace command (or the shorter bt) to show a backtrace (also called a stack trace) of the functions that were called leading up to the segfault. The output shows that the segfault happened in set_value(), when set_value() was called from main() at a certain line. This helps you discern which specific call to a function causes a segfault, when you have multiple calls to the same function.

Although GDB tells us that the error occurs in set_value(), it doesn’t necessarily mean that this specific function must be changed to fix the error. What value does the parameter pointer have when the program fails and is this the value that the corresponding argument should have?

When this program is fixed, it should output the value of a twice (after it has been set to 5), and the address of a twice.

Push this fix to your repo as well. If you forgot to push the fix to reverse.c earlier, be sure to push that too.

If you want to specify multiple files at once in UNIX, you can use the wildcard character * to select all files that match some pattern. For example, if you want to add all files that end in .c to git, you can use git add *.c. Think of the * as a representation for any string of characters. The wildcard character can be used with any UNIX command, not just git.

PA Preparation

Here is a fun little exercise to further practice what you’ve learned and get better prepared for the PA. In particular, we will be exploring the getopt function in C, which you will be required to use later on in PA3. If you already know what getopt does, great! If not, instead of reading the docs, I say we take a more hands on approach.

Included with your starter code, grader.c is a toy grader program that we (you) will use to explore what the function getopt does. Once you’ve read through the program to get a feel for what it does (it’s okay if you don’t fully understand it yet!), compile it with the -g flag.

Next, run the program a couple of times and see what it outputs! There are some sample commands below, but feel free to try and experiment on your own.

$ ./grader 90
$ ./grader -p 90
$ ./grader 10 20 30 40 50 60 70 80 90 100
$ ./grader -p 10 20 30 40 50 60 70 80 90 100
$ ./grader -c 20 -p 55 65 75

Finally, try dissecting this code to understand what the getopt function does and what the optind and optarg variables store. Note that to run GDB on a program that takes command-line arguments, you have to use the --args option. So if we wanted to use GDB to debug that last command, it would look something like

$ gdb --args ./grader -c 20 -p 55 65 75

Here are a couple of guiding questions:

  • What do the opt, optind, and optarg variables hold throughout the execution of the program?
  • Why do we check if optind < argc?
  • What happens if we run the program without an argument to the -c option? (something like ./grader -c -p 90 ...)
  • Why do we use optarg only when opt == 'c'?
  • What does the string "c:p" mean?

Feel free to modify this program as much as you’d like. If you’re feeling bored and/or adventurous, try using what you’ve learned to expand the functionality of this humble little grader.

Once you’ve explored to your heart’s content, try to write as detailed a description as possible on what getopt, optind, and optarg do, as if it’s for somebody who has never seen any of this in their lives and needs to learn this for a PA due in less than a week. Here is the official documentation for getopt. How close did you get?

Optional Feedback Form

If you’d like to give feedback on how labs are conducted and how they can be improved, please feel free to submit any comments in this anonymous form. This is a space for you to drop any comments you have at the end of every lab!