What is a Makefile? | Turning Git's Source Code into a Program
Table of Contents
- Introduction
- Turning Source Code into a Program
- What is Make?
- What is a Makefile?
- Makefile Structure: Targets, Prerequisites (Dependencies), and Recipes
- Executing a Makefile Target
- Example Makefile: Git's Original Makefile
- Makefile Variables - Advanced Make Syntax
- Specifying the Compiler in Makefiles
- Make Targets and Commands
- Baby Git Program Targets
- Linking Header Files in Makefiles
- The clean Target
- The backup Target
- GNU Make Manual
- Summary
- Next Steps
- References
Introduction
In this article, we'll discuss how Make and Makefiles work at a high level, then as an example describe how Git's original Makefile works and walk through it line by line.
Turning Source Code into a Program
Before getting straight into Makefiles, let's briefly cover how source code gets turned into an actual program that can run on a computer. Source code consists of a set of files and folders that contain code. Make is often used for C or C++ programs being compiled on Linux systems. Therefore, it is common to see C (.c) and C++ (.cpp) source files in projects that use makefiles.
This source code usually needs to be converted into a form that the computer can understand. This process is called compilation or compiling. A program that performs this conversion is called a compiler.
Sometimes the compiler needs to be given certain pieces of information so it can properly do its job. This information may include:
- The names and locations of the source code (input) files to compile
- The set of compiled (output) programs to create
- The names and locations to put the compiled (output) programs
- Whether or not to apply any special options in the compilation process
The process of choosing a compiler, identifying the set of source code files to be included, performing preparation steps, and compiling the code into its final form is called building, or the build process.
What is Make?
Make is a build automation tool. It would be very tedious for a developer to manually run all of the build steps in sequence each time they want to build their program. Build automation tools like Make allow developers to describe the build steps and execute them all at once.
What is a Makefile?
Makefiles are text files that developers use to describe the build process for their programs. The make command can then be used to conveniently run the instructions in the Makefile.
Makefile Structure: Targets, Prerequisites (Dependencies), and Recipes
The basic structure of a makefile is as follows:
target1: prerequisite1
	recipe1
target2: prerequisite2
	recipe2
...
Makefiles contain a list of named sets of commands that can be used to perform different actions within your codebase. Each named set of commands is called a makefile rule, and is made up of a makefile target, one or more optional makefile prerequisites (or makefile dependencies), and a makefile recipe.
- A makefile target is a name used to reference the corresponding set of makefile commands to be executed. It often represents a compiled output file or executable.
- A makefile prerequisite or makefile dependency is a source file or another target that is required by the current target.
- A makefile recipe is the set of commands that are executed by a specific target.
- A makefile rule is the combination of target, prerequisite/dependency, and recipe.
A target can either be a file or simply the name for a recipe of commands. When the target acts purely as a name for a set of commands, it is called a phony target. You can think of this kind of like a function name.
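As a concrete sketch of these terms, the makefile below builds a hypothetical program called hello from an assumed source file hello.c. The file names and the cc compiler command are illustrative and are not taken from Git's Makefile. Note that every recipe line must begin with a tab character.

```makefile
# Rule 1: the target "hello" is a real file built from its prerequisite hello.c.
hello: hello.c
	cc -o hello hello.c

# Rule 2: "greet" is a phony target. It names a set of commands, not a file.
.PHONY: greet
greet: hello
	./hello
```

With this file, running make greet would first build hello if it is missing or older than hello.c, and then run it.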
Executing a Makefile Target
The default makefile target is the first target listed in the makefile. In the example above this is target1. This can be executed by simply running make in the same directory as your makefile.
You can change the default target in your makefile by adding .DEFAULT_GOAL := target2 at the beginning of the file. Now when you run make in your working directory, the target2 target will execute instead of target1.
You can explicitly run a specific target at any time by running make <target-name>. So you could run make target2 to run that target even if it isn't the default.
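A minimal sketch of default target selection is shown here, assuming GNU Make (the .DEFAULT_GOAL variable is a GNU Make feature). The echo recipes are placeholders standing in for real build commands.

```makefile
# Without this line, "make" would run target1 because it is listed first.
.DEFAULT_GOAL := target2

target1:
	echo "running target1"

target2:
	echo "running target2"
```

With this file, plain make runs the target2 recipe, while make target1 still runs target1 explicitly.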
Oftentimes the first (and default) target in a makefile is the all target, which lists the other needed targets as its prerequisites so that they all get built. This is convenient since the developer can just run make in the working directory and the full build process will be completed without running multiple commands.
Another convention is to define a target called clean which deletes all of the compiled output files and executables from the last build. This allows the command make clean to prepare the local filesystem for subsequent builds.
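The all and clean conventions might look like the following sketch. The program names tool-a and tool-b, their source files, and the cc command are hypothetical and chosen only for illustration.

```makefile
# Hypothetical program names, used only for illustration.
PROGRAMS = tool-a tool-b

# "all" is listed first, so plain "make" builds every program.
all: $(PROGRAMS)

tool-a: tool-a.c
	cc -o tool-a tool-a.c

tool-b: tool-b.c
	cc -o tool-b tool-b.c

# "clean" removes build outputs and leaves the source files untouched.
clean:
	rm -f $(PROGRAMS)

# Mark the targets that do not name real files.
.PHONY: all clean
```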
By default, make prints each recipe command before running it. You can suppress this for a given command by adding an @ sign at the start of its line in the makefile.
For example, the line @echo Compiling source files will still display the text "Compiling source files" in the shell, but will not show the echo command itself. Similarly, the line @touch filename.ext will still create the file, but the touch command itself will not be displayed.
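The difference can be seen side by side in a small sketch. The target name demo is hypothetical.

```makefile
# "make demo" prints the first echo command line and then its output,
# but prints only the output of the second command.
demo:
	echo "this command line is shown before it runs"
	@echo "only the output of this command is shown"
```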
Example Makefile: Git's Original Makefile
Below is the original Makefile for Git. It is used to invoke the gcc C compiler to build binary executable files for each of the original 7 git commands:
- init-db
- update-cache
- cat-file
- show-diff
- write-tree
- read-tree
- commit-tree
This Makefile can be invoked in 3 variations (referred to as 3 targets), by running the 3 following commands from the command line shell inside the same directory as the Makefile:
- make clean: This removes all previously built executables and build files from the working directory.
- make backup: This first runs make clean and then backs up the current directory into a tar archive.
- make: This builds the codebase and creates the 7 git executables.
Enough talk - here is the code from Git's first Makefile:
CFLAGS=-g # The `-g` compiler flag tells gcc to add debug symbols to the executable for use with a debugger.
CC=gcc # Use the `gcc` C compiler.
# Specify the names of all executables to make.
PROG=update-cache show-diff init-db write-tree read-tree commit-tree cat-file
all: $(PROG)
install: $(PROG)
	install $(PROG) $(HOME)/bin/
# Link against the following external libraries.
LIBS= -lssl
# Specify which compiled output (.o files) to use for each executable.
init-db: init-db.o
update-cache: update-cache.o read-cache.o
	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
show-diff: show-diff.o read-cache.o
	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
write-tree: write-tree.o read-cache.o
	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
read-tree: read-tree.o read-cache.o
	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
commit-tree: commit-tree.o read-cache.o
	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
cat-file: cat-file.o read-cache.o
	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
# Rebuild these object files whenever the cache.h header changes.
read-cache.o: cache.h
show-diff.o: cache.h
# Define the steps to run during the `make clean` command.
clean:
	rm -f *.o $(PROG) temp_git_file_* # Remove these files from the current directory.
# Define the steps to run during the `make backup` command.
backup: clean
	cd .. ; tar czvf babygit.tar.gz baby-git # Back up the current directory into a tar archive.
As seen in the example above, you can add makefile comments by starting the line with a # sign.
Makefile Variables - Advanced Make Syntax
Variables can be defined in the Makefile to hold specific values. In the Makefile above, words such as CFLAGS and CC are variable names used to store the values that come after the equals sign. These particular names are conventional: make's built-in rules for compiling C code also read CC and CFLAGS. A variable reference like $(CFLAGS) can be used later in the Makefile to substitute in the variable's value where needed. This is convenient since we can use a variable in multiple places, while only updating it in one place if the value changes.
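Variable substitution can be sketched in isolation as follows. The target example and the source file example.c are hypothetical names.

```makefile
# Define each value once...
CC = gcc
CFLAGS = -g

# ...and reference it wherever it is needed. "example" and "example.c"
# are hypothetical names.
example: example.c
	$(CC) $(CFLAGS) -o example example.c
```

A variable can also be overridden for a single run from the command line. For instance, make CFLAGS=-O2 would use -O2 in place of -g without editing the file.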
Specifying the Compiler in Makefiles
Git is written in C, so this Makefile is tailored to a C build process using a C-specific pattern.
The first line CFLAGS=-g specifies the compiler flags - special compiler options - to use during compilation. In this case, the -g flag tells the compiler to include debug information in the compiled output so that the program can be inspected with a debugger.
The second line CC=gcc identifies the actual compiler to use. GCC is the GNU Compiler Collection. It supports compilation of code in several programming languages including C, C++, Fortran, and more.
Specifying the Executables in Makefiles
The next variable definition creates a build variable called PROG which contains the names of the executables we'll be creating.
Linking External Libraries in Makefiles
We'll quickly skip ahead to the line which defines the LIBS variable. This stores the external libraries that we want to link into the build process. In this case, we link in the SSL library which allows Git to access cryptographic functions like hashing.
Make Targets and Commands
Throughout the Makefile, there are multiple lines that start with a name followed by a colon such as all:, install:, init-db:, etc. As we mentioned earlier, each of these is called a target. Each target essentially maps to a command that you can specify when running Make, in the form make target.
For example, if you open a terminal window and browse to this Makefile's directory, you could run the make all command to run Make on the all target. Similarly you could run make install to run Make on the install target. If no target is specified, the all target will be used by default, since it is the first target in the file.
When Make runs a target, it executes the instructions associated with that target in the Makefile.
The All Target
Back to the Makefile, the all: $(PROG) line states that, when Make is run without specifying a target, all targets listed in $(PROG) will be built. Since $(PROG) lists all 7 of the Baby Git executables, each of them will be built.
The Install Target
The next target in the Makefile is install. It is run at the command line using make install. This starts the same way as the all target, by specifying the executables to compile using $(PROG). But then it uses the install command to copy those built executables into the bin directory inside the user's home directory.
Baby Git Program Targets
Now for the targets corresponding to the executable names:
init-db:
update-cache:
show-diff:
write-tree:
read-tree:
commit-tree:
cat-file:
Each one of these targets specifies which compiled C object (.o) files we want in each of our executables. Below that, each one except init-db specifies the compiler command to run based on the build variables specified earlier in the file.
The first executable init-db is very simple since it only includes 1 source file: init-db: init-db.o. It has no recipe of its own, so Make falls back on its built-in rule for linking a program from a single object file.
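The init-db rule has no recipe because Make supplies built-in (implicit) rules for compiling and linking C code. The sketch below spells out roughly what those rules do for init-db. It is an approximation: the real built-in rules also reference variables such as LDFLAGS and CPPFLAGS, which are not set in Git's Makefile.

```makefile
CC = gcc
CFLAGS = -g

# Roughly what make's built-in rule does to link init-db from init-db.o.
init-db: init-db.o
	$(CC) -o init-db init-db.o

# Roughly what make's built-in rule does to compile init-db.o from init-db.c.
init-db.o: init-db.c
	$(CC) $(CFLAGS) -c -o init-db.o init-db.c
```

The built-in link rule does not use the LIBS variable, so init-db is linked without -lssl.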
The other executables (we'll take update-cache as an example) link together multiple C object (.o) files:
update-cache: update-cache.o read-cache.o
	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
The second line above gets converted to the following after variable substitution:
gcc -g -o update-cache update-cache.o read-cache.o -lssl
Linking Header Files in Makefiles
After the program targets, there are two lines that list a C header (.h) file as a prerequisite of an object (.o) file. The only header file in the Baby Git codebase is cache.h, which is listed as a prerequisite of read-cache.o and show-diff.o. This tells Make to rebuild those object files whenever cache.h changes. C header files typically contain function declarations, macros, and type definitions that are shared by multiple files in the codebase.
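To make the role of cache.h concrete, here is a sketch of the read-cache.o rule written out in full. The recipe is an assumption modelled on Make's built-in C compilation rule, since Git's Makefile does not spell it out.

```makefile
CC = gcc
CFLAGS = -g

# read-cache.o is rebuilt if either read-cache.c or cache.h is newer than it.
# cache.h is only a prerequisite; it is not passed to the compiler command.
read-cache.o: read-cache.c cache.h
	$(CC) $(CFLAGS) -c -o read-cache.o read-cache.c
```

The compiler still reads cache.h, but it does so through the #include directive inside read-cache.c. The makefile entry only lets Make know that a change to the header makes the object file out of date.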
The clean Target
This target is invoked using make clean and simply deletes all compiled code and executables from the working directory. It leaves the source files alone so that the program can be built again.
The backup Target
This target is invoked using make backup. First it invokes the clean target. Then it backs up the source code files in the working directory as a tar archive in the parent directory.
GNU Make Manual
A valuable resource when working with make is the GNU Make Manual. It is the official documentation and covers the many make features that we couldn't cover in this article.
Summary
In this article we discussed the basic concepts, terminology and structure of makefiles. We then walked through Git's first Makefile line by line. We hope it helped you understand how Makefiles work and how they are implemented in practice.
Next Steps
If you're interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers to learn how Git works at the code level. To do this we documented the first version of Git's code and discussed it in detail.
References
- GNU Make Manual - https://www.gnu.org/software/make/manual/make.html