C Programming

Motivation

The goal of this tutorial is to show how to program efficiently in the C programming language, since both required courses, COMP 206 and COMP 310, use the language without a proper introduction to its tooling. It is suggested, but not required, that you learn a proper console-based text editor (e.g. vim or emacs) so you can both code and use the tools in the same terminal.

Coding Style

Probably one of the most crucial aspects of coding efficiently (in any language), but also the one I do not want to elaborate on, is adopting a consistent coding style. Inconsistent code is hard to read and write, which invariably leads to a higher bug count, which translates to more time wasted on debugging. No respectable company will tolerate a bad coding style, so you had better start now.

If you do not know what style you should adopt, I suggest a coding style that has survived 30 years in a codebase of more than 10 million lines of code. The Linux Kernel Coding Style should scale well for all your C projects.

Compiling

The two main compilers available on CS systems are gcc and clang, which conveniently have almost the same interface. This means the options discussed below can be applied to both compilers. It is good practice to make sure your code compiles fine on more than one compiler, as different compilers might generate complementary warnings.

There are a number of options you should always pass to your compiler to get the most out of it:

  • -pedantic warns whenever the code strays from standard C by using non-portable compiler extensions, so that your code should compile fine on any C compiler.
  • -Wall enables the most commonly useful warnings the compiler supports.
  • -Wextra enables extra warnings not covered by -Wall.
  • -g generates debugging information.

The final command looks something like this:

$ gcc -g -pedantic -Wall -Wextra *.c

You should then fix your code until there are no errors or warnings left. It might be tempting to ignore some warnings, but they are most likely a symptom of bad coding practices at the very least.

Note that it is considered good practice to code in a top-down fashion: start by coding the bare minimum needed to get a successful compile, then fill in the previously defined empty functions as you implement the specific functionalities of your program.

This has the added advantage of sparing you a tremendous number of errors and warnings once everything is done, and it ensures you fix any obvious design flaws early on.

Ensuring Correctness

Both the C and C++ programming languages have a certain peculiarity that makes them very different from a language like Java, for example: a program might compile successfully while being invalid.

More specifically, there are syntactically valid programs for which the compiler will happily, or with warning(s), generate an incorrect binary. We say of such a program that it contains undefined behaviour.

A lot of students have difficulty grasping this concept, but it is impossible (or at least extremely inefficient) to try to reason logically about the flow of a program which contains undefined behaviour.

Take the following incorrect code as an example:

#include <stdio.h>

int
main(void)
{
	printf("This might or might not print anything...");

	int *a = 0x0;
	*a = 0;

	return 0;
}

In the example above, the program is invalid because I try to write to a memory segment that was not properly allocated. Even though the print statement comes before the invalid memory access, at least on my machine, the print statement is omitted and the program immediately yields a segmentation fault.

If there is one thing to take away from this example, it is that ensuring the program is properly defined comes before logically debugging your code. Another takeaway is that debugging a C program with print statements is highly inefficient and error-prone.

Valgrind

valgrind is the most useful tool when programming in C. By default it acts as a memory checker and will warn against pretty much every invalid use of memory and then some.

For the purpose of this tutorial, we will say that your C program is expected to be valid (at least with high probability) if you compiled the program with the compiler options described above, fixed all warnings and errors returned by both gcc and clang, and then fixed all errors and warnings returned by valgrind's memory checker.

Let us run valgrind on the code snippet above:

$ clang -g -pedantic -Wall -Wextra *.c  # Note the absence of any warning...
$ valgrind ./a.out

==29785== Memcheck, a memory error detector
==29785== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29785== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==29785== Command: ./a.out
==29785==
==29785== Invalid write of size 4
==29785==    at 0x40053E: main (main.c:9)
==29785==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==29785==
==29785==
==29785== Process terminating with default action of signal 11 (SIGSEGV)
==29785==  Access not within mapped region at address 0x0
==29785==    at 0x40053E: main (main.c:9)
==29785==  If you believe this happened as a result of a stack
==29785==  overflow in your program's main thread (unlikely but
==29785==  possible), you can try to increase the size of the
==29785==  main thread stack using the --main-stacksize= flag.
==29785==  The main thread stack size used in this run was 8388608.
This might or might not print anything...==29785==
==29785== HEAP SUMMARY:
==29785==     in use at exit: 0 bytes in 0 blocks
==29785==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==29785==
==29785== All heap blocks were freed -- no leaks are possible
==29785==
==29785== For counts of detected and suppressed errors, rerun with: -v
==29785== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

A lot of noise, but the important part is this:

==29785== Invalid write of size 4
==29785==    at 0x40053E: main (main.c:9)
==29785==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

This tells you exactly what is wrong (an invalid write), where it happened (file "main.c", line 9), and why it is wrong (the address 0x0 was not allocated for this program). At the end there is a summary of heap usage, which lets you know whether your program has a memory leak (allocated memory that is never freed), along with the total number of errors encountered.

Here the print statement did happen, but that is because valgrind runs the code in a way that makes it more deterministic for its analysis. Running your code in valgrind might also uncover portability bugs, since the program may behave differently than when it is run as-is.

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	printf("This is guaranteed to print something!");

	/* malloc returns NULL on failure, so check before writing. */
	int *a = malloc(sizeof(int));
	if (a == NULL)
		return 1;
	*a = 0;

	free(a);

	return 0;
}

Fixing the program and running valgrind again confirms it is now a valid C program.

$ clang -g -pedantic -Wall -Wextra *.c  # Note the absence of any warning...
$ valgrind ./a.out

==3457== Memcheck, a memory error detector
==3457== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3457== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3457== Command: ./a.out
==3457==
This is guaranteed to print something!==3457==
==3457== HEAP SUMMARY:
==3457==     in use at exit: 0 bytes in 0 blocks
==3457==   total heap usage: 2 allocs, 2 frees, 1,028 bytes allocated
==3457==
==3457== All heap blocks were freed -- no leaks are possible
==3457==
==3457== For counts of detected and suppressed errors, rerun with: -v
==3457== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

As an exercise, you can remove the free(a); from the code and see how it affects the output of valgrind.

Debugging

Once your program is well coded and generates no warnings or errors from either the compilers or the memory checker, but its behaviour is still unexpected, you most likely have a logical error in your code.

If that happens, you will want to use a debugger. gdb is the most common C debugger in the UNIX world, and I will try to show you how to use it as concisely as possible. First, let's fire it up:

$ clang -O0 -ggdb -pedantic -Wall -Wextra *.c
$ gdb -tui ./a.out

Note that:

  • I changed the -g option to -ggdb to generate debugging symbols optimized for usage in gdb.
  • I added the -O0 option to the compiler to disable all optimizations. This ensures the code flow will be as expected and that no code will be optimized out by the compiler.
  • I passed the -tui option to gdb so that it would show the code we are working on (assuming it was compiled with -g or, preferably, -ggdb).

Now here is a summary of the most useful commands:

Command    Abbreviation  Description
run        r             Runs the program. Arguments can be specified.
break      b             Pauses the program execution at the specified line or function name.
watch      wa            Pauses the program execution and notifies you when the given symbol's value changes.
continue   c             Resumes the program execution, assuming it is paused.
next       n             When paused, executes the next line of code.
step       s             Same as next, but will "enter" functions.
until      u             Like next, but will run loops to completion.
print      p             Displays the current value of the given symbol.
backtrace  bt            Yields a trace of all the function calls that led to that point in the program.
quit       q             Exits GDB.