Lab 6: Makefiles and Bash Scripting

Getting Started
Makefiles
Bash Scripting
1. Part 4: Scripting for Testing
2. Part 5: Scripting for Fun
PA5 Preparation
Optional Feedback Form

Getting Started

With your partner, discuss:

If you could choose a superpower to have, what would it be? What would you do with it?
Who’s your favorite superhero/villain, if you have any?

Makefiles

Accept the assignment on GitHub Classroom (https://classroom.github.com/a/fGaAznqd) to create a repo with the starter code for this lab. Clone this repo into your cse29 folder on ieng6.

For this lab, the starter code is neatly separated into different directories labeled part1 through part5, which each correspond to the titles of the sections below. When we use make, it will look for a Makefile in the current directory to use, so this lets us create distinct Makefiles for each of them.

When moving on to a new section of this writeup, please also change into the corresponding directory!

From a prior PA, you have already seen that a Makefile is a thing that exists, and you can use it to compile code instead of running gcc manually. In this lab, we’ll go into more detail about each component of a Makefile, and how we can design them in different ways.

In software development, programmers may use more complex programs dedicated to automating building, such as CMake, which can automatically generate Makefiles. As such, we don’t expect you to need to write Makefiles from scratch for future projects. However, it is still helpful to know what a Makefile does and how, which is the learning objective of this section.

Part 1: Makefile for One

In this directory, we are given a single, very simple source code file program.c. You can look at its contents, but there’s nothing there to see (or do).

What we’re more interested in is writing a Makefile from scratch to simplify common tasks we want to do with this project.

Compiling

Open a new file called Makefile (without any file extension). This is the file that configures the functions of the make command. A Makefile mostly consists of “rules” which have the form:

target: dependencies
    recipe

In a lot of ways, you can think of defining rules in Makefiles like defining functions in C, but there are important differences.

The target could be thought of as the name of the rule. We use the target to tell make which rule should be used. Unlike functions, Makefiles expect targets to be the names of files.
The dependencies are files or other targets that the creation of the target depends on. For C programs, these dependencies are usually source code and object files.
The recipe contains the commands that are executed when make uses this rule. Recipes can have one or more different commands to be executed sequentially.

It’s not necessary to define dependencies, but we often do because make automatically checks whether any of a rule’s dependencies have changed more recently than the target file. If none have (and the target file already exists), then make does not bother to execute the recipe, because the target file must already be up to date. In other words, make only executes the recipe if the target file doesn’t exist, or if one of its dependencies was updated more recently than the target file.
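The check described above can be imitated in bash with the -nt ("newer than") file test. This is only a sketch of the idea, not how make is implemented; needs_rebuild is a hypothetical helper name, and the timestamps are made up for the demonstration.

```shell
#!/bin/bash
# needs_rebuild is a hypothetical helper: it mimics make's decision
# for one target file and one dependency file.
needs_rebuild() {
    local target="$1" dep="$2"
    # Rebuild if the target is missing or the dependency is newer than it.
    if [ ! -e "$target" ] || [ "$dep" -nt "$target" ]; then
        echo "rebuild"
    else
        echo "up to date"
    fi
}

workdir=$(mktemp -d)
touch -t 202001010000 "$workdir/program.c"   # source last edited in 2020
touch -t 202101010000 "$workdir/program"     # executable built in 2021
needs_rebuild "$workdir/program" "$workdir/program.c"   # prints: up to date

touch -t 202201010000 "$workdir/program.c"   # source edited again in 2022
needs_rebuild "$workdir/program" "$workdir/program.c"   # prints: rebuild
rm -r "$workdir"
```

Both calls pass the target first and the dependency second, mirroring the order in which they appear in a rule.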

A typical example of a rule is the one below:

program: program.c
    gcc -Wall -g -std=gnu99 -o program program.c

In this rule, the target is program, which is the executable file we want to create with this rule. The recipe is a gcc command to produce program, which you would normally run manually in the terminal. Since we define program.c to be a dependency of this rule, this means that program will only be recompiled if program.c is more recently updated than program.

This example just so happens to work perfectly for the Makefile we want to write in this section, so fill in your Makefile with this rule.

Most of the time, the tried and true tools we use are at least somewhat well designed, which explains and contributes to their reliability. This is not one of those times.

Furthermore, in a previous lab, we’ve given you a .vimrc file which automatically replaces tab characters you type with four spaces. Most of the time, this is helpful. This is not one of those times.

From these two pieces of information you might be able to infer the bad news: Makefiles by default require each line of a recipe to start with a tab character. Four space characters on top of each other in a trench coat do not work here! Those four spaces you see at the start of the line with the gcc command are supposed to be one single tab character in the Makefile.

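To temporarily disable the expandtab setting in .vimrc, type :set noexpandtab in command mode (not insert mode) in Vim and press Enter. (:set expandtab! also works, but it toggles the setting, so running it a second time turns expandtab back on.) The setting stays off until you exit Vim.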

It’s pretty difficult to distinguish between spaces and tabs in Vim alone. You can output the contents of the Makefile with the tab characters replaced by “&” with the following command: cat Makefile | tr "\t" "&". This makes use of the pipe |, which we’ll learn about in more detail very soon (but not “today” soon). The tr command replaces the tab character “\t” with “&”.
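If you would rather have the offending lines pointed out directly, a grep for lines that begin with a space can help. The sketch below builds a small throwaway Makefile to demonstrate; check_recipe_indent is a hypothetical helper name, and the sample contents are made up.

```shell
#!/bin/bash
# check_recipe_indent is a hypothetical helper: it lists every line of a
# file that starts with a space (grep -n adds the line number).
check_recipe_indent() {
    grep -n '^ ' "$1"
}

demo=$(mktemp)
# printf turns \t into a real tab. The gcc line is correctly indented with
# a tab; the rm line is (wrongly) indented with four spaces.
printf 'program: program.c\n\tgcc -o program program.c\n\nclean:\n    rm program\n' > "$demo"

tr '\t' '&' < "$demo"          # tabs show up as &, spaces stay invisible
check_recipe_indent "$demo"    # prints: 5:    rm program
rm "$demo"
```

A line reported by the helper is only a problem if it is meant to be part of a recipe.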

After writing this rule into the Makefile, you can then run make with the target to run the compilation command in the recipe:

$ make program

Notice that make prints out the recipe command, and, if you check the contents of the directory, executes that command to compile program.c. Try running make program again to see that make refuses to recompile program, because it’s already up to date. Then make a small change to program.c, and run make program again to see that it recompiles if program.c is changed.

This Makefile has already greatly simplified our workflow: instead of typing 44 characters to compile the program, you can type just 12 characters instead, giving you an average of 8 whole extra seconds to not understand a compile error message. But we can do even better!

On a line before the “program” rule, so that it becomes the first rule in the Makefile, write:

default: program

When make is executed without a target, it uses the first rule in the Makefile. The name default is a common convention rather than a keyword. You could think of this line as an ordinary rule with the program target as a dependency and no recipe. With this line at the top of the Makefile, running make by itself builds default, which in turn executes the “program” rule.

Running and Cleaning

Although recipes typically contain commands used to create their corresponding target files, recipes can also contain any other commands you could run in the terminal. As such, some other common uses for Makefiles are to run a program and clean up after a program.

For this program, the rule for running program could be defined as:

run: program
    ./program

Remember that Makefiles require each line of a recipe to start with a tab character!

This simple rule depends on the program target, meaning that it will automatically recompile program if necessary, and then runs the program. In this case, the target is not a file that we expect to create, just a convenient name that we use to invoke this rule.

Similarly, we also often define a rule to clean up files that are produced from the build process. This specific example does not produce any, but sometimes it is also desirable to clean up the target file itself in order to recompile without changes to the source code.

clean:
    rm program

In most cases, this will work without issue, but in the rare case that you create a file called “run” or “clean”, the corresponding rule won’t work properly anymore. This occurs because make does not recognize that “run” and “clean” are not supposed to be files. So when a file of that name is created, the standard behavior of make causes our intended functionality of these two rules to fail: make will not use these rules unless that file no longer exists or a dependency updates. If you want to, try making a file called “run” or “clean” to see this happen.

In order to account for this edge case, we can manually define run and clean to be phony targets:

.PHONY: run clean

After defining these rules, your Makefile might look something like this:

default: program

program: program.c
    gcc -Wall -g -std=gnu99 -o program program.c

.PHONY: run clean

run: program
    ./program

clean:
    rm program

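These rules (and the phony target definition) can be defined in any order, with one exception: when make is run without a target, it uses the first rule in the file (special targets such as .PHONY do not count), so the default rule should stay at the top.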

Part 2: Makefile for Many

In this section, we’ll show multiple valid Makefiles for the programs in part2. As you follow along, pick one you like the look of and use it to compile all three programs.

When we have multiple programs to be compiled in a single project, we could create a Makefile with rules for each:

default: program1 program2 program3

program1: program1.c
    gcc -Wall -g -std=gnu99 -o program1 program1.c
program2: program2.c
    gcc -Wall -g -std=gnu99 -o program2 program2.c
program3: program3.c
    gcc -Wall -g -std=gnu99 -o program3 program3.c

Notice how much repetition there is between each rule here. In this case, the repetition is just mildly annoying, but if you have more independent programs (like I do when designing lab activities), mildly annoying becomes very annoying! We’ll see how we can reduce repetition in two different ways that we’ll use together to create a very concise and flexible Makefile.

Variables

Like in C programs, you can also define variables in Makefiles. But unlike C programs, where defined variables are allocated in some memory when the program is run, variables in Makefiles just represent some string value. This lets us reduce the amount of typing we have to do when we want to, for example, change the gcc flags used in all rules. As such, some common values we can define as variables are the compiler command and its flags:

CC = gcc
CFLAGS = -Wall -g -std=gnu99

default: program1 program2 program3

program1: program1.c
    $(CC) $(CFLAGS) -o program1 program1.c
program2: program2.c
    $(CC) $(CFLAGS) -o program2 program2.c
program3: program3.c
    $(CC) $(CFLAGS) -o program3 program3.c

The variables CC and CFLAGS are defined with the values gcc and -Wall -g -std=gnu99, respectively. Then we use these variables in each of the recipes. Note that there is a special syntax when we use the variables: $(X), where X is the variable name. This syntax tells the Makefile to expand the variable X to use its value, instead of interpreting “X” as a literal string.

Pattern Rules

Each of these three rules has a similar pattern: each one is identical to the others except for a single number that changes. To eliminate this repetition, we can merge these rules into one pattern rule:

CC = gcc
CFLAGS = -Wall -g -std=gnu99

default: program1 program2 program3

program%: program%.c
    $(CC) $(CFLAGS) -o $@ $<

A couple of new symbols were introduced in this pattern rule:

A target with a “%” character creates a pattern rule. The “%” in the target can match any non-empty string, then for each corresponding match, the “%” has that same value in the dependencies. For example, this rule matches program1, program2, and program3 and defines their respective dependencies program1.c, program2.c, and program3.c. This will also define dependencies for any valid match to the target: program4 depends on program4.c, programaaa depends on programaaa.c, etc.
In a pattern rule, we use automatic variables to refer to the target and dependencies, since their exact value is not determined explicitly.
- $@ is an automatic variable which represents the target of the rule.
- $< is an automatic variable which represents the first dependency of the rule.
- Other useful automatic variables are given here.
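To make the matching concrete, here is a rough bash imitation of how the rule program%: program%.c derives a dependency name from a target name. It is only an illustration of the idea, not how make works internally; dep_for is a hypothetical helper name.

```shell
#!/bin/bash
# dep_for is a hypothetical helper that imitates `program%: program%.c`.
dep_for() {
    case "$1" in
        # "?*" requires at least one character, like the non-empty "%" match.
        program?*)
            stem="${1#program}"       # strip the fixed prefix to get the matched part
            echo "program${stem}.c"   # reuse the same matched part in the dependency
            ;;
        *)
            echo "no rule matches $1"
            return 1
            ;;
    esac
}

dep_for program1          # prints: program1.c
dep_for programaaa        # prints: programaaa.c
dep_for program || true   # prints: no rule matches program (the match would be empty)
```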

If we were really bold (which we are), we could generalize this Makefile further:

CC = gcc
CFLAGS = -Wall -g -std=gnu99

default: program1 program2 program3

%: %.c
    $(CC) $(CFLAGS) -o $@ $<

This pattern rule now matches any name (not just names that begin with “program”) to be a target, and defines its dependency to be a file with that name plus the “.c” suffix.

In this section, we’ve developed a Makefile to be increasingly flexible, both in making future changes easier and in expanding the scope of valid targets. An important point to make (pun intended?) is that each of these Makefiles is a valid Makefile for compiling the three programs given in this directory, and they have their own pros and cons. For example, a Makefile similar to the last one was used in last week’s lab to easily compile programs with different names, where the compilation process is the same across programs. However, it might be undesirable to enable the programmer to attempt compiling any file ending in “.c”. On the other side, the first Makefile might be a good fit for a use case where we know we will customize the build process for each program, but this could lead to a very large Makefile.

Part 3: Linking Object Files

When we use gcc to manually compile programs, we typically compile directly from the source file to the executable program. But remember that the build process involves multiple steps with intermediary files. One kind of intermediary file is the object file, which is produced by the assembler and then linked into the executable file.

The linking process resolves symbols between object files, meaning that functions defined in one file can be used in another. In part3, an implementation of a linked list is given in linked_list.c. The corresponding header file, linked_list.h, contains function declarations to be shared between source files. Then, in one_list.c, we do things with a linked list.

We can use the following gcc commands to create then link the object files:

$ gcc -Wall -g -std=gnu99 -c -o one_list.o one_list.c
$ gcc -Wall -g -std=gnu99 -c -o linked_list.o linked_list.c
$ gcc -Wall -g -std=gnu99 -o one_list one_list.o linked_list.o

We create one_list.o from one_list.c, create linked_list.o from linked_list.c, then link the two to produce the executable one_list. My fingers hurt from all that typing; I wish there was an easier way to MAKE all these files…

An example of such a Makefile would be:

CC = gcc
CFLAGS = -Wall -g -std=gnu99
TARGET = one_list
OBJS = one_list.o linked_list.o

default: $(TARGET)

linked_list.o: linked_list.c
    $(CC) $(CFLAGS) -c -o $@ $<

one_list.o: one_list.c
    $(CC) $(CFLAGS) -c -o $@ $<

$(TARGET): $(OBJS)
    $(CC) $(CFLAGS) -o $(TARGET) $(OBJS)

run: $(TARGET)
    ./$(TARGET)

clean:
    rm $(TARGET) $(OBJS)

Here, we make extensive use of variables for the ultimate target (one_list) and its prerequisite object files (one_list.o and linked_list.o) so that we can easily use these strings in multiple places, e.g. in both the compile command and in the rm command.

Notice that the two rules for creating the object files have identical recipes. Try writing a pattern rule that matches both linked_list.o and one_list.o with their respective dependencies. This new pattern rule will replace the rules for linked_list.o and one_list.o in the Makefile. Then try compiling one_list to check that the Makefile works.

Hint

The recipes of the two rules are the same, so that part stays the same in the pattern rule. What parts of the target and dependency names are the same in both rules and which are different? Use the “%” character to match the part that is different.

Hint

… oh, you want another hint already? If you’re really stuck (or just want to move on), I suppose you can have this “hint”.

A "hint"...
... is sometimes itself a solution. `%.o: %.c`

Bash Scripting

Bash scripts are files which contain sequences of commands to be executed when the script is executed. This gives us a way to automate a series of commonly used commands.

Part 4: Scripting for Testing

In PA 4, we suggested a method to test out sequences of commands in webster by feeding in a file to the program. This already simplifies the “command-typing” part of testing, but what about the “cross-check-with-reference” part of testing?

To simplify cross-checking, we will introduce the diff command. diff is a command which takes in two files as input, and prints out a report of the difference in contents between the two files. Try out diff on two test files that we’ve given to you:

$ diff tests/load_short.txt tests/load_shortest.txt

In the output of diff:

1c1: denotes that line 1 in load_short.txt and line 1 in load_shortest.txt differ in content (the “c” stands for “changed”).
< load dict/short: shows that line 1 of load_short.txt contains “load dict/short”.
> load dict/shortest: shows that line 1 of load_shortest.txt contains “load dict/shortest”.
---: just a separator, like those separators for the checkout conveyor belt at the grocery store.

If the two files are identical, then diff prints nothing.
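You can reproduce this kind of output without the lab files. The sketch below creates two small stand-in files (their contents are made up, not the real test files) and also shows that diff reports its result through its exit status: 0 when the files match and 1 when they differ. same_files is a hypothetical helper name.

```shell
#!/bin/bash
# same_files is a hypothetical helper: it succeeds only if the two files
# are identical. -q stops diff from printing the detailed report.
same_files() {
    diff -q "$1" "$2" > /dev/null
}

workdir=$(mktemp -d)
printf 'load dict/short\nquit\n' > "$workdir/a.txt"
printf 'load dict/shortest\nquit\n' > "$workdir/b.txt"

# Line 1 differs and line 2 matches, so diff reports a single 1c1 change.
diff "$workdir/a.txt" "$workdir/b.txt" || echo "diff exit status: $?"

if same_files "$workdir/a.txt" "$workdir/a.txt"; then
    echo "a.txt matches itself"
fi
rm -r "$workdir"
```

The exit status is what lets a script decide whether a test passed without reading the report itself.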

man, I wish there existed a command that would tell me what other commands do in great detail!

Here, we’ve given you a (broken) webster program to try testing with. To use diff to cross-check the webster output against the reference program on a specific test file, we can use the sequence of commands:

$ ./webster < tests/load_short.txt > my_output.txt
$ ./ref-webster < tests/load_short.txt > ref_output.txt
$ diff my_output.txt ref_output.txt

In order to use the output of each webster program in diff, we save the outputs to files with redirection >. As the name suggests, it redirects the output of webster to the file, meaning that it does not print to the terminal.
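Here is a small sketch of both redirections that runs without webster. The shout function is a hypothetical stand-in for webster: like webster, it reads from standard input and writes to standard output.

```shell
#!/bin/bash
# shout is a hypothetical stand-in for webster: it reads standard input
# and writes an upper-case copy to standard output.
shout() {
    tr 'a-z' 'A-Z'
}

workdir=$(mktemp -d)
printf 'load dict/short\n' > "$workdir/input.txt"

# "<" feeds the file to standard input; ">" sends standard output to a file.
shout < "$workdir/input.txt" > "$workdir/output.txt"

echo "nothing was printed by shout; its output is in the file:"
cat "$workdir/output.txt"    # prints: LOAD DICT/SHORT
rm -r "$workdir"
```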

Running each of the three commands in sequence for every test would be awful, and needing to edit the middle part for each test file even more so. In run_one_test.sh, we’ll create a script to automate this:

#!/bin/bash

./webster < "$1" > my_output.txt
./ref-webster < "$1" > ref_output.txt
diff my_output.txt ref_output.txt

The #!/bin/bash at the top specifies that this script should be run using the bash program located at /bin/bash. The leading #! is called a “shebang” and signals the start of this directive.

The contents of the script are nearly the same as the sequence of commands we just executed. In place of the path to the test file in the webster program commands, we have written $1. This is a special variable in bash scripts that gets the first argument to the script. We want to pass in the test file path like so when we run the script:

$ ./run_one_test.sh tests/load_short.txt
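Positional arguments work the same way inside bash functions, which makes them easy to experiment with. In this sketch, describe_args is a hypothetical function; it also uses $#, the number of arguments, to print a usage message when the test file is missing.

```shell
#!/bin/bash
# describe_args is a hypothetical function; functions receive $1, $2, ...
# the same way a script receives its command-line arguments.
describe_args() {
    if [ "$#" -lt 1 ]; then
        echo "usage: describe_args TEST_FILE" >&2
        return 1
    fi
    echo "number of arguments: $#"
    echo "first argument: $1"
}

describe_args tests/load_short.txt
describe_args "a path with spaces.txt" extra
```

Writing "$1" in double quotes keeps a path that contains spaces together as a single word.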

This is a simple example of a bash script to show how we can automate sequences of commands. We can also create more complex bash scripts with conditionals and loops that allow us to do more advanced tasks. A run_tests.sh script is given to you as an example.

This script takes in a path to test file(s), runs each test on the webster program, checks the output against ref-webster, produces a file containing the output of diff in the results directory, and prints out a report of how many tests failed, if any. Try this out on every test file in the tests directory, remembering to first create the results directory:

$ mkdir results
$ ./run_tests.sh tests/*

We will not go over the syntax of bash scripting in detail here, but the contents of run_tests.sh are annotated with comments if you’d like to explore this script.
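To give a flavor of the loops and conditionals involved, here is a much smaller sketch of the same idea. It is not the provided run_tests.sh: count_failures is a hypothetical function, it does not write to the results directory, and the demonstration uses sort and cat as stand-ins for webster and ref-webster so that it runs anywhere.

```shell
#!/bin/bash
# Hypothetical sketch, not the provided run_tests.sh.
# Usage: count_failures PROGRAM REFERENCE TEST_FILE...
count_failures() {
    local prog="$1" ref="$2" failed=0 test_file my_out ref_out
    shift 2                          # the remaining arguments are test files
    my_out=$(mktemp)
    ref_out=$(mktemp)
    for test_file in "$@"; do
        "$prog" < "$test_file" > "$my_out"
        "$ref" < "$test_file" > "$ref_out"
        # diff exits with a non-zero status when the two outputs differ.
        if ! diff "$my_out" "$ref_out" > /dev/null; then
            echo "FAILED: $test_file"
            failed=$((failed + 1))
        fi
    done
    rm "$my_out" "$ref_out"
    echo "$failed test(s) failed"
}

workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/sorted.txt"
printf 'b\na\n' > "$workdir/unsorted.txt"
# sort and cat agree on sorted.txt but disagree on unsorted.txt.
count_failures sort cat "$workdir/sorted.txt" "$workdir/unsorted.txt"
rm -r "$workdir"
```

The demonstration reports one failure, for unsorted.txt, because sort reorders that file's lines while cat leaves them alone.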

If you are still working on PA 4, feel free to make use of this script to automate testing!

Part 5: Scripting for Fun

With sufficient complexity, bash scripts can be their own standalone programs. Some fun examples of bash scripts are given here, which include an in-terminal Wordle game:

./wordle.sh

and a script to render Pokemon sprites from National Dex numbers. Try out:

./pokeget.sh 25

Who’s that Pokemon?

If you’d like to learn more about these specific scripts, their sources are linked at the top of their files.

PA5 Preparation

One of the important skills you will need for PA5 is bit manipulation. That’s what we’re going to practice today!

You will be implementing three different bit manipulation functions to get you comfortable with extracting information from bitstrings: set_nth_bit, sub_bitstring, and loop_mask. I recommend you attempt them in that order.

The function signatures have been provided for you in bitstrings.c, along with a helper function called print_bin, which will print out your bitstring in binary format. You may have noticed the size_t type. This type is essentially an unsigned integer type wide enough to hold the size of any object on your machine’s architecture. Don’t worry too much about this.

The print_bin function provided in the starter code is incorrect because it prints bits in the wrong order. Please replace it with the following version of print_bin before starting!

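void print_bin(size_t n) {
  size_t i;
  // 1u keeps the shift unsigned; shifting a signed 1 into the sign bit is undefined.
  unsigned int mask = 1u << (sizeof(int) * 8 - 1);

  // Prints the low sizeof(int) * 8 bits of n, most significant bit first.
  for (i = 0; i < sizeof(int) * 8; i++) {
    if (n & mask) {
      printf("1");
    } else {
      printf("0");
    }
    mask >>= 1;
  }
}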

Here is a code snippet with usages of common bitwise operators you may find useful, taken from Geeksforgeeks:

#include <stdio.h>

int main(void) {
    // a = 5(00000101), b = 9(00001001)
    unsigned char a = 5, b = 9;
    printf("a = %d, b = %d\n", a, b);

    // The result is 00000001 (AND)
    printf("a&b = %d\n", a & b);

    // The result is 00001101 (OR)
    printf("a|b = %d\n", a | b);

    // The result is 00001100 (XOR)
    printf("a^b = %d\n", a ^ b);

    // The result is 11111010 (NOT). Assigning back to a keeps only the
    // low 8 bits; note that this also changes a itself.
    printf("~a = %d\n", a = ~a);

    // The result is 00010010 (left shift)
    printf("b<<1 = %d\n", b << 1);

    // The result is 00000100 (right shift)
    printf("b>>1 = %d\n", b >> 1);

    return 0;
}

Do try to attempt these on your own at first, and don’t worry if you struggle, since these are challenging functions to implement, especially when you’re first learning bitwise operators. After giving it your best shot, please feel free to collaborate with your fellow students and/or ask the tutors and TA for help. Good luck and have fun!

Optional Feedback Form

If you’d like to give feedback on how labs are conducted and how they can be improved, please feel free to submit any comments in this anonymous form. This is a space for you to drop any comments you have at the end of every lab!