Git banned.h: Why Git's maintainers have a list of banned standard C library functions

Introduction

If you take a look at Git's source code, you might notice a C header file called banned.h.

This file contains a list of C standard library functions that have been banned from use in Git's source code by the Git maintainer and contributor community.

In this article, we'll take a closer look at how the C function ban is implemented, which C functions are covered by it, and why they were banned from the codebase.

How does one even implement and enforce a C function ban?

Before we jump into the actual standard C functions included in the ban, let's discuss how a function ban can even be implemented.

Git is written mostly in the C programming language, which is a compiled language. This means that before the program can run, a compiler such as gcc or clang must translate the source code into machine code that the CPU can execute.

This makes possible a handy strategy for implementing community-based source code rules such as Git's function ban:

Find a way to fail the compilation step if a contributor tries to use a banned function anywhere in the source code.

This would not only stop the contributor from building the offending code locally, but ideally also give them an understandable explanation of why the compilation failed.

How can a banned function make the C compiler fail?

So the next question becomes: "how do you make the compiler throw an error when a specific function is used?"

The C language has a concept called a macro, which is a named fragment of code. Before compilation, the C preprocessor scans the source files and replaces each use of the macro's name with the corresponding fragment. This is useful for defining reusable program-wide constants and code snippets that are substituted in before the compilation step proper.

More specifically, C also provides a subtype of macros called function-like macros, which allow for some dynamic behavior that can be useful for not repeating yourself. Git implements such a function-like macro as follows:

#define BANNED(func) sorry_##func##_is_a_banned_function

This macro tells the preprocessor to replace BANNED(func) with sorry_##func##_is_a_banned_function.

The name in parentheses, func, is a parameter that stands for whatever argument is supplied where the macro is used. The ## on either side of func is the preprocessor's token-pasting operator: it joins the argument to the neighboring text to form a single identifier, so BANNED(strcpy) becomes sorry_strcpy_is_a_banned_function.

So the idea is that when a banned function is used in the source code, the call is rewritten into this identifier. Because no variable or function with that name is ever declared, the compiler rejects the code, and the identifier it complains about doubles as an understandable message for the user.
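To see what the preprocessor actually produces, here is a minimal sketch that pairs Git's BANNED macro with a few helpers of our own. The names STR, STR_INNER and banned_strcpy_expands_to are illustrative assumptions and do not appear in banned.h. The helpers turn the pasted identifier into a string literal so that it can be inspected at run time instead of breaking the build.

```c
#include <string.h>

/* Same definition as in Git's banned.h: ## pastes tokens into one identifier. */
#define BANNED(func) sorry_##func##_is_a_banned_function

/* Hypothetical helpers (not part of banned.h): STR expands its argument
 * first, then STR_INNER turns the result into a string literal with #. */
#define STR_INNER(x) #x
#define STR(x) STR_INNER(x)

/* Returns 1 if BANNED(strcpy) expands to the identifier spelled by
 * `expected`, and 0 otherwise. */
int banned_strcpy_expands_to(const char *expected)
{
    return strcmp(STR(BANNED(strcpy)), expected) == 0;
}
```

Calling banned_strcpy_expands_to("sorry_strcpy_is_a_banned_function") returns 1, which confirms that the two ## operators glued the three pieces into a single name.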

So how is the C function-like macro actually used?

Now let's take a look at the first standard C function that was included in the ban, strcpy. The following code snippet was included in a file called banned.h, which was committed to the root directory of Git's source code on July 26th, 2018:

#ifndef BANNED_H
#define BANNED_H

#define BANNED(func) sorry_##func##_is_a_banned_function

#undef strcpy
#define strcpy(x,y) BANNED(strcpy)

#endif /* BANNED_H */

The #ifndef, #define and #endif lines form an include guard, and the first definition inside it is the macro that we already saw above.

Below that, #undef strcpy removes any existing macro named strcpy (a C library is allowed to implement its functions as macros), and then a new function-like macro, strcpy(x,y), is defined. This is clever because the name of the new macro is the name of the function to be banned, and its replacement is the function-like macro defined previously, BANNED(strcpy), with strcpy passed in as the argument.

This is pretty cool because it effectively chains two macros together: any call to strcpy is replaced with BANNED(strcpy), which in turn becomes the identifier sorry_strcpy_is_a_banned_function. Since nothing declares that identifier, compilation fails, and the compiler's error message shows the customized name to the user in the process.
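The following sketch shows the ban from the point of view of a contributor's source file. The function greeting_length and the flag TRY_BANNED_CALL are hypothetical names introduced for illustration; only the BANNED and strcpy macro definitions come from banned.h. As written, the file compiles because the banned call is hidden behind the flag.

```c
#include <stdio.h>
#include <string.h>

/* The two definitions below mirror Git's banned.h. */
#define BANNED(func) sorry_##func##_is_a_banned_function

#undef strcpy
#define strcpy(x,y) BANNED(strcpy)

/* Hypothetical example function: copies a short greeting into a local
 * buffer and returns its length. */
size_t greeting_length(void)
{
    char buf[16];

#ifdef TRY_BANNED_CALL
    /* Expands to the undeclared identifier
     * sorry_strcpy_is_a_banned_function, so the build fails here. */
    strcpy(buf, "hello");
#else
    /* Bounded alternative: writes at most sizeof(buf) bytes,
     * terminating null byte included. */
    snprintf(buf, sizeof(buf), "%s", "hello");
#endif

    return strlen(buf);
}
```

Building the same file with -DTRY_BANNED_CALL should make the compiler reject it, typically with an undeclared-identifier error that names sorry_strcpy_is_a_banned_function. The exact wording depends on the compiler. Note also that a function-like macro is only expanded when its name is followed by an opening parenthesis, so the ban catches ordinary calls but not, for example, taking the function's address.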

Why was the strcpy function banned from Git's source code?

According to the inline code comments in the banned.h header file, functions get included in the banned list for the following overarching reason:

This header lists functions that have been banned from our code base, because they're too easy to misuse (and even if used correctly, complicate audits). Including this header turns them into compile-time errors.

In C, a function is easy to misuse when it can silently perform unsafe memory operations unless the programmer is very careful. This is the case for functions like strcpy().

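The POSIX documentation covers strcpy() together with the closely related stpcpy() function, and summarizes the pair as follows:

copy a string and return a pointer to the end of the result

Returning a pointer to the end of the result describes stpcpy(); strcpy() copies the string in the same way but returns s1, the destination pointer. The synopsis for strcpy() is as follows:

char *strcpy(char *restrict s1, const char *restrict s2);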

Note that the documentation even states that "If copying takes place between objects that overlap, the behavior is undefined."

The problem occurs when a string s2 that occupies n bytes (including its terminating null byte) is copied into a buffer s1 that has only been allocated m bytes, where m < n. strcpy() has no way of knowing the size of s1, so it keeps writing past the end of the buffer until it has copied the null byte, overwriting whatever memory happens to follow. This is undefined behavior: it won't necessarily crash the program (although it could), and depending on what gets overwritten it can be difficult to troubleshoot and fix.
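The size condition can be written down as a small check. This is a minimal sketch with a hypothetical helper name, fits_in_buffer; it is not part of Git or the C library. It shows the comparison between n and m that strcpy() never performs on its own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if src, including its terminating null
 * byte, fits into a destination buffer of dst_size bytes, else 0. */
int fits_in_buffer(size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1; /* n: the characters plus the null byte */

    return needed <= dst_size;       /* copying is safe only when n <= m */
}
```

For example, "hello" needs six bytes, so fits_in_buffer(6, "hello") returns 1 while fits_in_buffer(5, "hello") returns 0. A plain strcpy() into a five-byte buffer would write the null byte one position past the end.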

See the Git mailing list for more details on the initial ban of strcpy().

What to use instead of strcpy()?

It is preferable to use a function that better protects against buffer overruns and also ensures that the copied string always terminates with a null byte. This can be accomplished using the snprintf() function.

int snprintf(char *buf, size_t size, const char *fmt, ...);

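Note that when using this function, the programmer directly specifies the size of the destination buffer. snprintf() writes at most that many bytes, including the trailing null byte, which is added automatically whenever the size is nonzero. If the source string doesn't fit, the output is truncated rather than written past the end of the buffer, and the return value (the length the full string would have had) lets the caller detect the truncation. This mitigates the possibility of buffer overruns and non-null-terminated strings.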
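As a minimal sketch of this approach, the helper below wraps snprintf() to copy a string into a fixed-size buffer and report truncation. The name copy_string and its return convention are our own assumptions for illustration, not something defined by Git or the C library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, which holds dst_size bytes.
 * Returns 0 if the whole string fit, or -1 if it was truncated or
 * snprintf() reported an error. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    /* "%s" keeps any % characters in src from being treated as a format. */
    int written = snprintf(dst, dst_size, "%s", src);

    /* snprintf() returns the length the full output would have had, so a
     * value of dst_size or more means the copy did not fit. */
    if (written < 0 || (size_t)written >= dst_size)
        return -1;

    return 0;
}
```

Copying "hello" into a four-byte buffer with this helper returns -1 and leaves the buffer holding the null-terminated string "hel", so the overrun becomes a detectable truncation instead.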

Other C standard library functions banned by Git

Here is the full list of C standard library functions banned by Git, as of the publish date of this article:

#ifndef BANNED_H
#define BANNED_H

/*
 * This header lists functions that have been banned from our code base,
 * because they're too easy to misuse (and even if used correctly,
 * complicate audits). Including this header turns them into compile-time
 * errors.
 */

#define BANNED(func) sorry_##func##_is_a_banned_function

#undef strcpy
#define strcpy(x,y) BANNED(strcpy)
#undef strcat
#define strcat(x,y) BANNED(strcat)
#undef strncpy
#define strncpy(x,y,n) BANNED(strncpy)
#undef strncat
#define strncat(x,y,n) BANNED(strncat)

#undef sprintf
#undef vsprintf
#define sprintf(...) BANNED(sprintf)
#define vsprintf(...) BANNED(vsprintf)

#undef gmtime
#define gmtime(t) BANNED(gmtime)
#undef localtime
#define localtime(t) BANNED(localtime)
#undef ctime
#define ctime(t) BANNED(ctime)
#undef ctime_r
#define ctime_r(t, buf) BANNED(ctime_r)
#undef asctime
#define asctime(t) BANNED(asctime)
#undef asctime_r
#define asctime_r(t, buf) BANNED(asctime_r)

#endif /* BANNED_H */

You can check out the current list by visiting the official list of banned C functions in Git's source code.
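The list also covers several time functions. gmtime() and localtime() return a pointer to storage that may be shared between calls, so a later call can silently overwrite an earlier result. As a minimal sketch, and as an assumption on our part rather than something taken from Git's code, the POSIX function gmtime_r() avoids this by writing into a struct tm supplied by the caller. The helper name utc_year is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L /* ask the headers to declare gmtime_r() */
#include <time.h>

/* Hypothetical helper: returns the UTC calendar year for t, or -1 if the
 * conversion fails. */
int utc_year(time_t t)
{
    struct tm result; /* caller-owned storage instead of a shared buffer */

    if (gmtime_r(&t, &result) == NULL)
        return -1;

    return result.tm_year + 1900; /* tm_year counts years since 1900 */
}
```

On systems where time_t counts seconds since the Unix epoch, utc_year(0) returns 1970.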

Summary

In this article, we discussed the set of banned C functions in Git's source code and explained how the ban is implemented using the banned.h header file in Git's code. We saw that this ban was implemented in 2018 and has evolved a bit since then to include a variety of C functions. We also covered that the main reason for the function bans is to prevent the use of functions that repeatedly introduce the same issues due to their ease of misuse, especially when these issues can lead to tricky memory bugs that can be a pain to troubleshoot.

Next Steps

If you're interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers to learn how Git works at the code level. To do this, we documented the first version of Git's code and discussed it in detail.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.

References

  1. Official list of banned functions in Git's source code - https://github.com/git/git/blob/master/banned.h
  2. The Open Group strcpy() documentation - https://pubs.opengroup.org/onlinepubs/9699919799/functions/strcpy.html
  3. Git mailing list strcpy() ban reference - https://lore.kernel.org/git/20180719203259.GA7869@sigill.intra.peff.net/
  4. Kernel snprintf() documentation - https://www.kernel.org/doc/htmldocs/kernel-api/API-snprintf.html