Thursday, May 23, 2013

Notes on GCC and Make
yet another insignificant Programming Notes
by ehchua

1.  GCC (GNU Compiler Collection)

1.1  A Brief History and Introduction to GCC

The original GNU C Compiler (GCC) was developed by Richard Stallman, the founder of the GNU Project. GCC is a key component of the GNU Toolchain, which includes:
  1. GNU Compiler Collection (GCC): a compiler suite that supports many languages, such as C/C++, Objective-C and Java.
  2. GNU Make: an automation tool for compiling and building applications.
  3. GNU Binutils: a suite of binary utility tools, including the linker and assembler.
  4. GNU Debugger (GDB).
  5. GNU Autotools: a build system including Autoconf, Autoheader, Automake and Libtool.
  6. GNU Bison: a parser generator (similar to yacc).

1.2  Installing GCC

MinGW (Minimalist GNU for Windows) is a port of the GNU Compiler Collection (GCC) and GNU Binutils for use in Windows. It also includes MSYS (Minimal System), which is basically a Bourne-style shell (bash).
Cygwin GCC
Cygwin is a Unix-like environment and command-line interface for Microsoft Windows. Cygwin is huge and includes most of the Unix tools and utilities. It also includes the commonly used Bash shell.

1.3  Getting Started

To compile hello.c:
> gcc hello.c
  // Compile and link source file hello.c into executable a.exe
The default output executable is called "a.exe".
To run the program:
// Under CMD Shell
> a
// Under Bash or Bourne Shell - include the current path (./)
$ ./a
NOTES (for Bash Shell, Bourne Shell and Unixes):
  • In Bash or Bourne shell, the default PATH does not include the current working directory, so you need to prefix the command with the current path (./). (The Windows CMD shell searches the current directory automatically; Unixes do not, unless you add the current directory to the PATH explicitly.)
  • In some Unixes, the output file could be "a.out" or simply "a". Furthermore, you may need to assign the executable file-mode (x) to the executable file "a.out", via the command "chmod a+x filename" (add executable file-mode "+x" for all users "a").
To specify the output filename, use -o option:
> gcc -o hello.exe hello.c
  // Compile and link source file hello.c into executable hello.exe
> hello
  // Execute hello.exe under CMD shell
$ ./hello
  // Execute hello.exe under Bash or Bourne shell, specifying the current path (./)
NOTE for Unixes: In Unixes, you may omit the .exe file extension and simply name the output executable hello. If the file is not already executable, assign the executable file mode via the command "chmod a+x hello".
> gcc -o hello hello.c

To compile a C++ program, use g++ instead of gcc:
> g++ -o hello.exe hello.cpp
   // Compile and link source hello.cpp into executable hello.exe
> hello
   // Execute under CMD shell
$ ./hello
   // Execute under Bash or Bourne shell, specifying the current path (./)

More GCC Compiler Options
A few commonly-used GCC compiler options are:
$ g++ -Wall -g -o Hello.exe Hello.cpp
  • -o: specifies the output executable filename.
  • -Wall: prints "all" warning messages.
  • -g: generates additional symbolic debugging information for use with the gdb debugger.
Compile and Link Separately
// Compile-only with -c option
> g++ -c Hello.cpp
// Link object file(s) into an executable
> g++ -o Hello.exe Hello.o
The options are:
  • -c: compiles into the object file "Hello.o"; no linking is performed.
  • -o: specifies the output filename. Linking is performed when the input files are object files (".o").
Compile and Link Multiple Source Files
Suppose that your program has two source files: file1.cpp and file2.cpp. You could compile all of them in a single command:
> g++ -o myprog.exe file1.cpp file2.cpp 
However, we usually compile each of the source files separately into an object file, and link them together at a later stage. In this case, a change in one file does not require re-compilation of the other files.
> g++ -c file1.cpp
> g++ -c file2.cpp
> g++ -o myprog.exe file1.o file2.o
Compile into a Shared Library
To compile and link a C/C++ program into a shared library (".dll" in Windows, ".so" in Unixes), use the -shared option. Read "Java Native Interface" for an example.

1.4  GCC Compilation Process

GCC compiles a C/C++ program into an executable in 4 steps. For example, the command "gcc -o hello.exe hello.c" is carried out as follows:
  1. Preprocessing: via the GNU C Preprocessor (cpp.exe), which includes the headers (#include) and expands the macros (#define).
    > cpp hello.c > hello.i
    The resultant intermediate file "hello.i" contains the expanded source code.
  2. Compilation: The compiler compiles the preprocessed source code into assembly code for a specific processor.
    > gcc -S hello.i
    The -S option specifies to produce assembly code, instead of object code. The resultant assembly file is "hello.s".
  3. Assembly: The assembler (as.exe) converts the assembly code into machine code in the object file "hello.o".
    > as -o hello.o hello.s
  4. Linker: Finally, the linker (ld.exe) links the object code with the library code to produce an executable file "hello.exe".
    > ld -o hello.exe hello.o ...libraries...
Verbose Mode (-v)
You can see the detailed compilation process by enabling -v (verbose) option. For example,
> gcc -v hello.c -o hello.exe
Defining Macro (-D)
You can use the -Dname option to define a macro, or -Dname=value to define a macro with a value. The value should be enclosed in double quotes if it contains spaces.

1.5  Headers (.h), Static Libraries (.lib, .a) and Shared Libraries (.dll, .so)

Static Library vs. Shared Library
A library is a collection of pre-compiled object files that can be linked into your programs via the linker. Examples are the system functions such as printf() and sqrt().
There are two types of external libraries: static library and shared library.
  1. A static library has the file extension ".a" (archive file) in Unixes or ".lib" (library) in Windows. When your program is linked against a static library, the machine code of the external functions used in your program is copied into the executable. A static library can be created via the archive program "ar.exe".
  2. A shared library has the file extension ".so" (shared objects) in Unixes or ".dll" (dynamic link library) in Windows. When your program is linked against a shared library, only a small table is created in the executable. Before the executable starts running, the operating system loads the machine code needed for the external functions, a process known as dynamic linking. Dynamic linking makes executable files smaller and saves disk space, because one copy of a library can be shared between multiple programs. Furthermore, most operating systems allow a single copy of a shared library in memory to be used by all running programs, thus saving memory. The shared library code can be upgraded without the need to recompile your program.
Because of the advantages of dynamic linking, GCC, by default, links to the shared library if it is available.
You can list the contents of a library via "nm filename".
Compiler and Linker Searching for Header Files and Libraries (-I, -L and -l)
When compiling the program, the compiler needs the header files to compile the source code; the linker needs the libraries to resolve external references from other object files or libraries. The compiler and linker will not find the headers/libraries unless you set the appropriate options, which is not obvious for first-time users.
For each of the headers used in your source (via #include directives), the compiler searches the so-called include-paths for these headers. The include-paths are specified via the -Idir option (or the environment variable CPATH). Since the header's filename is known (e.g., iostream.h, stdio.h), the compiler only needs the directories.
The linker searches the so-called library-paths for libraries needed to link the program into an executable. The library-path is specified via the -Ldir option (uppercase 'L' followed by the directory path) (or the environment variable LIBRARY_PATH). In addition, you also have to specify the library name. In Unixes, the library libxxx.a is specified via the -lxxx option (lowercase letter 'l', without the prefix "lib" and the ".a" extension). In Windows, provide the full name such as -lxxx.lib. The linker needs to know both the directories and the library names. Hence, two options need to be specified.
Default Include-paths, Library-paths and Libraries
Try listing the default include-paths in your system used by the "GNU C Preprocessor" via "cpp -v":
> cpp -v
#include "..." search starts here:
#include <...> search starts here:
 d:\mingw\bin\../lib/gcc/mingw32/4.6.2/include             // d:\mingw\lib\gcc\mingw32\4.6.2\include
 d:\mingw\bin\../lib/gcc/mingw32/4.6.2/../../../../include // d:\mingw\include
 d:\mingw\bin\../lib/gcc/mingw32/4.6.2/include-fixed       // d:\mingw\lib\gcc\mingw32\4.6.2\include-fixed
Eclipse CDT: In Eclipse CDT, you can set the include paths, library paths and libraries by right-clicking on the project ⇒ Properties ⇒ C/C++ General ⇒ Paths and Symbols ⇒ under the tabs "Includes", "Library Paths" and "Libraries". The settings are applicable to the selected project only.

1.6  GCC Environment Variables

GCC uses the following environment variables:
  • PATH: For searching the executables and run-time shared libraries (.dll, .so).
  • CPATH: For searching the include-paths for headers. It is searched after paths specified in -I<dir> options. C_INCLUDE_PATH and CPLUS_INCLUDE_PATH can be used to specify C and C++ headers if the particular language was indicated in preprocessing.
  • LIBRARY_PATH: For searching library-paths for link libraries. It is searched after paths specified in -L<dir> options.

"file" Utility - Determine File Type
> file hello.o
hello.o: 80386 COFF executable not stripped - version 30821
> file hello.exe
hello.exe: PE32 executable (console) Intel 80386, for MS Windows
"nm" Utility - List Symbol Table of Object Files
The utility "nm" lists the symbol table of object files.
> nm hello.o
00000000 b .bss
00000000 d .data
00000000 r .eh_frame
00000000 r .rdata
00000000 t .text
         U ___main
00000000 T _main
         U _printf
         U _puts
> nm hello.exe | grep printf
00406120 I __imp__printf
0040612c I __imp__vfprintf
00401b28 T _printf
00401b38 T _vfprintf
"nm" is commonly-used to check if a particular function is defined in an object file. A 'T' in the second column indicates a function that is defined, while a 'U' indicates a function which is undefined and should be resolved by the linker.
"ldd" Utility - List Dynamic-Link Libraries
The utility "ldd" examines an executable and displays a list of the shared libraries that it needs. For example,
> ldd hello.exe
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x77bd0000)
kernel32.dll => /cygdrive/c/Windows/system32/kernel32.dll (0x77600000)
KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x75fa0000)
msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x763f0000)

2.  GNU Make

The "make" utility automates the building of executables from source code. "make" uses a so-called makefile, which contains rules on how to build the executables.
You can issue "make --help" to list the command-line options; or "man make" to display the man pages.

Create the following file named "makefile" (without any file extension), which contains rules to build the executable, and save in the same directory as the source file. Use "tab" to indent the command (NOT spaces).
all: hello.exe

hello.exe: hello.o
	gcc -o hello.exe hello.o

hello.o: hello.c
	gcc -c hello.c

clean:
	rm hello.o hello.exe
Run the "make" utility as follows:
> make
gcc -c hello.c
gcc -o hello.exe hello.o
Running make without an argument starts the first target in the makefile, which is "all" here. A makefile consists of a set of rules. A rule consists of 3 parts: a target, a list of pre-requisites and a command, as follows:
target: pre-req-1 pre-req-2 ...
	command
The target and pre-requisites are separated by a colon (:). The command must be preceded by a tab (NOT spaces).
When make is asked to evaluate a rule, it begins by finding the files in the prerequisites. If any of the prerequisites has an associated rule, make attempts to update those first.
In the above example, the rule "all" has a pre-requisite "hello.exe". make cannot find the file "hello.exe", so it looks for a rule to create it. The rule "hello.exe" has a pre-requisite "hello.o". Again, it does not exist, so make looks for a rule to create it. The rule "hello.o" has a pre-requisite "hello.c". make checks that "hello.c" exists and that it is newer than the target (which does not exist). It runs the command "gcc -c hello.c". The rule "hello.exe" then runs its command "gcc -o hello.exe hello.o". Finally, the rule "all" does nothing.
More importantly, if the pre-requisite is not newer than the target, the command will not be run. In other words, the command will be run only if the target is out-dated compared with its pre-requisites, which avoids unnecessary recompilation.
For example, if we re-run the make command:
> make
make: Nothing to be done for `all'.
You can also specify the target to be made in the make command. For example, the target "clean" removes "hello.o" and "hello.exe". You can then run make without a target, which is the same as "make all".
> make clean
rm hello.o hello.exe
> make
gcc -c hello.c
gcc -o hello.exe hello.o
Try modifying "hello.c" and running make again.
  • If the command is not preceded by a tab, you get an error message "makefile:4: *** missing separator. Stop."
  • If there is no makefile in the current directory, you get an error message "make: *** No targets specified and no makefile found. Stop."
  • The makefile can be named "makefile", "Makefile" or "GNUmakefile", without file extension.

