git bundle | A Guide for Bundling Git Repos

Introduction
What is a Git Bundle?
Git Bundle Example
Git Bundle Range of Commits
Can You Push to a Git Bundle?
Git Bundle For Backing Up Repos
Git Pack Files
Conclusion
Next steps
References

Introduction

Git bundle is a utility included with Git that allows you to package an entire repository into a single file quickly. This can be helpful when you want to share work with other developers in an offline or secure setting.

What makes Git bundle so valuable is its flexibility. You can bundle an entire repository, a single branch, or just a range of commits. Once you generate a Git bundle, you can clone it or fetch from it using standard Git commands.

In this article, we will examine the benefits of using bundles in an offline environment and some shortcomings inherent in working with them. We’ll also review a few examples in the terminal to demonstrate typical usage.

What is a Git Bundle?

Bundle files can consist of entire Git repositories, specific branches, or a set of commits within a branch.

A Git bundle is a binary file created with the git bundle command. Because bundles are native to Git, the commands that read from a remote repository, such as git clone, git fetch, git pull, and git ls-remote, accept a bundle file in place of a URL.

For example, you can run git clone myRepo.bundle to create a working copy of the Git repository, or git fetch myRepo.bundle to import the commits it contains into an existing copy of the repository.

Git Bundle Example

Creating Git bundles is a reasonably straightforward process. For this example, assume we have a repo named exbundle, with two branches named master and feature.

Let’s review how to bundle the entire Git repository, a single branch, and also a range of commits. Next, we’ll demonstrate how to extract the bundle via git clone and a technique for importing updates to the origin repo.

Git Bundle Entire Repository

You can create a bundle for an entire Git repository using git bundle with the --all flag. The syntax is as follows:

$ git bundle create fullRepo.bundle --all
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
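
Before handing a bundle to someone, it can help to check what it contains. The following is a minimal sketch, assuming a throwaway exbundle repo created in a temporary directory with a placeholder commit identity and file contents. It uses git bundle verify to check the file and git bundle list-heads to print the refs stored in it.

```shell
#!/bin/sh
set -e
# Placeholder identity so the commit works on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Throwaway repo with two branches, master and feature.
work=$(mktemp -d)
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master   # name the first branch master
echo "hello" > file1
git add file1
git commit -q -m "Initial commit"
git branch feature

# Bundle every ref (progress output is discarded).
git bundle create "$work/fullRepo.bundle" --all 2>/dev/null

# Check that the file is well formed and usable from this repository.
git bundle verify "$work/fullRepo.bundle"

# List the refs stored in the bundle.
heads=$(git bundle list-heads "$work/fullRepo.bundle")
echo "$heads"
```

With --all, the listing should include both branches as well as HEAD.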

Git Bundle Single Branch

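When bundling a repository, you may only want to include a single branch. To bundle only the master branch, name it on the command line. Including HEAD as well lets git clone work out which branch to check out. The syntax is as follows: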

$ git bundle create master.bundle HEAD master
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0

Once you have created your bundle, you can copy it to the target machine via a thumb drive or any other method you choose. After copying it, open a terminal, navigate to the directory containing the bundle, and clone the repository with the git clone command like so:

$ git clone master.bundle
Cloning into 'master'...
Receiving objects: 100% (3/3), done.
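
A clone made from a bundle remembers the bundle file as its origin remote, which is why a remote-tracking branch such as origin/master exists in the new working copy. Here is a small sketch of that, assuming a throwaway repo in a temporary directory; the exbundle-copy directory name and the commit identity are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so the commit works on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Source repo with one commit on master, bundled with HEAD and master.
work=$(mktemp -d)
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master
echo "hello" > file1
git add file1
git commit -q -m "Initial commit"
git bundle create "$work/master.bundle" HEAD master 2>/dev/null

# Clone into an explicit directory name instead of the default 'master'.
cd "$work"
git clone -q master.bundle exbundle-copy
cd exbundle-copy

# Inspect where origin points and what the local master branch tracks.
origin_url=$(git config --get remote.origin.url)
tracking=$(git rev-parse --abbrev-ref "master@{upstream}")
echo "origin -> $origin_url"
echo "master tracks $tracking"
```

Passing a second argument to git clone is optional; it only avoids ending up with a directory named after the bundle file.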

At this point, you have a working copy of the repository. But what if you wanted to make some changes and send them back to the origin repo?

You can do this by bundling only the new range of commits, copying the bundle to the origin machine, and fetching the new commits from it into a temporary branch. You can then run git merge to merge this temporary branch back into the master branch.

First, let's view an example demonstrating how to bundle the new commits:

Git Bundle Range of Commits

Assume you've made two commits in your new working copy that you'd like to transfer back to the origin repo. We can view the difference between the new working copy and the origin by selecting a range of commits with the git log command like so:

$ git log --oneline origin/master..master
d770a17 (HEAD -> master) Commit 2 in cloned repo.
4ce866d Commit 1 in cloned repo.

The output shows that we are two commits ahead of the origin branch. You can now use this same range selection syntax with the git bundle command:

$ git bundle create patch.bundle origin/master..master
Enumerating objects: 8, done.
Counting objects: 100% (8/8), done.
Compressing objects: 100% (2/2), done.
Total 6 (delta 0), reused 0 (delta 0), pack-reused 0

Finally, you can transfer the patch.bundle file into the root directory containing the origin repo.
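
A bundle built from a range is incomplete on its own: it records the commit just before the range as a prerequisite, and it can only be used in a repository that already has that commit. The sketch below illustrates this under assumed names (a temporary directory, a copy directory for the clone, an empty repo, and a placeholder identity). It runs git bundle verify in the origin repo, where the check should pass, and in an unrelated empty repo, where it should fail.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# Origin repo with one commit, bundled and cloned.
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master
echo "line 1" > file1
git add file1
git commit -q -m "Initial commit"
git bundle create "$work/master.bundle" HEAD master 2>/dev/null
git clone -q "$work/master.bundle" "$work/copy"

# Two new commits in the clone, bundled as a range.
cd "$work/copy"
echo "line 2" >> file1
git commit -q -am "Commit 1 in cloned repo."
echo "line 3" >> file1
git commit -q -am "Commit 2 in cloned repo."
git bundle create "$work/patch.bundle" origin/master..master 2>/dev/null

# The origin repo already has the prerequisite commit, so this succeeds.
cd "$work/exbundle"
git bundle verify "$work/patch.bundle"

# An unrelated empty repo lacks the prerequisite, so the same check fails.
git init -q "$work/empty"
cd "$work/empty"
if git bundle verify "$work/patch.bundle" 2>/dev/null; then
  empty_result=ok
else
  empty_result=missing-prerequisites
fi
echo "empty repo: $empty_result"
```

Running the verify step on the receiving machine before fetching is a cheap way to catch a bundle that was cut from the wrong starting point.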

Unbundling the Git Bundle

After moving the file successfully, create a new branch in your origin repo called temp and check it out. Next, run the git fetch command against your patch bundle like so:

$ git fetch -u patch.bundle master:temp
Receiving objects: 100% (6/6), done.
From patch.bundle
   34109e8..d770a17  master     -> temp

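You have successfully imported your new commits into the temp branch. As a side note, the -u flag (short for --update-head-ok) is needed here because temp is the branch currently checked out, and git fetch refuses to update the current branch by default. Git's documentation describes this flag as intended mainly for internal use by git pull, so if you prefer to avoid it, run the fetch while a different branch such as master is checked out and drop the -u.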

To finish the import, switch back to your master branch and merge the temp branch:

$ git switch master
$ git merge temp
Updating 34109e8..d770a17
Fast-forward
 file1 | 4 ++++
 1 file changed, 4 insertions(+)

And that's it; we've completed the process of updating the origin branch with changes made from the new working copy.
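
For reference, here is the whole round trip as one script, written as a sketch rather than a definitive recipe. It assumes a temporary directory, a clone directory named copy, and a placeholder identity. It also uses a small variation on the steps above: the fetch runs while master is checked out, so Git creates temp itself and the -u flag is not needed.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# Origin repo, bundled and cloned.
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master
echo "line 1" > file1
git add file1
git commit -q -m "Initial commit"
git bundle create "$work/master.bundle" HEAD master 2>/dev/null
git clone -q "$work/master.bundle" "$work/copy"

# New work in the clone, bundled as a range.
cd "$work/copy"
echo "line 2" >> file1
git commit -q -am "Commit 1 in cloned repo."
echo "line 3" >> file1
git commit -q -am "Commit 2 in cloned repo."
git bundle create "$work/patch.bundle" origin/master..master 2>/dev/null

# Back in the origin repo, still on master: fetch the bundle's master
# into a new temp branch, merge it, then delete the temporary branch.
cd "$work/exbundle"
git fetch -q "$work/patch.bundle" master:temp
git merge -q temp
git branch -q -d temp

origin_head=$(git rev-parse master)
clone_head=$(git -C "$work/copy" rev-parse master)
echo "origin master: $origin_head"
echo "clone master:  $clone_head"
```

If the two hashes match, the origin repo has caught up with the clone.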

Can You Push to a Git Bundle?

Unfortunately, you cannot run git push against a bundle file. A bundle is a static file rather than a repository: there is nothing on the other end to receive new objects and update refs, so Git can only read from it.

Alternatively, you can always clone the bundle, add your commits to that clone, and then create a fresh bundle from it.
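
Because a clone keeps the bundle's path as its origin, updates can flow the other way by overwriting the bundle file and pulling. The following sketch assumes a temporary directory, a bundle named exbundle.bundle, a clone directory named copy, and a placeholder identity. It regenerates the bundle after a new commit and then runs an ordinary git pull in the clone.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# Source repo, bundled and cloned.
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master
echo "line 1" > file1
git add file1
git commit -q -m "Initial commit"
git bundle create "$work/exbundle.bundle" HEAD master 2>/dev/null
git clone -q "$work/exbundle.bundle" "$work/copy"

# New work lands in the source repo; regenerate the bundle at the same path.
echo "line 2" >> file1
git commit -q -am "Second commit"
git bundle create "$work/exbundle.bundle" HEAD master 2>/dev/null

# The clone's origin still points at that path, so a pull reads the new file.
cd "$work/copy"
git pull -q --ff-only

source_head=$(git -C "$work/exbundle" rev-parse master)
copy_head=$(git rev-parse master)
echo "source: $source_head"
echo "copy:   $copy_head"
```

In practice the overwrite step would be a file copy from a thumb drive or similar, but the pull on the receiving side looks the same.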

Git Bundle For Backing Up Repos

Bundles, clones, and zip files all work for backing up a repository, but bundles are often the most convenient option to store and move around.

Similar to bundle files, zip files compress your project into one file. However, zip files aren't optimized for compressing Git objects, and they don't have the added benefit of running Git commands against them, such as git clone.

Using git clone --mirror is also a viable backup option. However, copying a sizeable raw repository can be a laborious task. Since Git repositories can easily consist of tens of thousands of files, copying this from one place to another, or even simply storing it, is less efficient than using a bundle.
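
A backup is only useful if it restores cleanly, so here is a hedged sketch of a backup and restore cycle. The backup.bundle and restored names, the temporary directory, and the commit identity are all assumptions made for the example. It bundles every ref, verifies the file, clones it into a new directory, and compares the number of commits on both sides.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# Repo with a master branch and a feature branch that is one commit ahead.
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master
echo "line 1" > file1
git add file1
git commit -q -m "Initial commit"
git checkout -q -b feature
echo "feature work" > file2
git add file2
git commit -q -m "Feature commit"
git checkout -q master

# Back up every ref into one file and check it.
git bundle create "$work/backup.bundle" --all 2>/dev/null
git bundle verify "$work/backup.bundle"

# Restore by cloning the bundle, then compare commit counts.
git clone -q "$work/backup.bundle" "$work/restored"
source_count=$(git rev-list --all --count)
restored_count=$(git -C "$work/restored" rev-list --all --count)
echo "source commits:   $source_count"
echo "restored commits: $restored_count"
```

Note that in the restored clone, branches other than the checked-out one appear as remote-tracking branches such as origin/feature.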

Git Pack Files

Git pack files and bundles are closely related: a bundle is essentially a short header that lists refs, followed by a pack file containing the objects those refs need. Pack files, however, exist independently of bundles and serve a broader purpose.

As a repository grows, it accumulates many versions of the same files. Git initially stores each version as its own complete blob object, even when it differs only slightly from the previous one, spanning back to when the file was first committed.

When Git packs a repository, for example during garbage collection with git gc, it computes the deltas (changes) between similar objects and saves them, typically along with the most recent version of the blob in full, inside a pack file. The goal is to go from many loose object files down to one pack.

This space optimization can significantly shrink the repo size while preserving your project's full history. A bundle reuses the pack format to carry objects between machines, whereas pack files inside a repository exist to store them efficiently.
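
To see packing in action, the sketch below counts loose objects before and after git gc using git count-objects -v, then prints the first line of a bundle file, which is a plain-text signature that precedes the packed data. The temporary directory, file contents, and commit identity are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
git init -q "$work/exbundle"
cd "$work/exbundle"
git symbolic-ref HEAD refs/heads/master

# Three commits, each storing a slightly different version of file1.
for i in 1 2 3; do
  echo "line $i" >> file1
  git add file1
  git commit -q -m "Commit $i"
done

# Loose objects before packing.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

# Garbage collection moves the loose objects into a pack file.
git gc -q
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "loose before gc: $loose_before"
echo "loose after gc:  $loose_after"
echo "objects in pack: $packed_after"

# A bundle begins with a text signature line; the pack data comes later.
git bundle create "$work/exbundle.bundle" --all 2>/dev/null
header=$(head -n 1 "$work/exbundle.bundle")
echo "bundle header: $header"
```

On a tiny repo like this the size savings are negligible; the point is only to show the objects moving from loose files into a pack.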

Conclusion

Git bundles are a viable option when working with repositories in an offline environment. Whether your internet is down, company policies are strict, or you work in a secure environment, bundles can be used to transfer a repo's data into and out of the location in question.

Commands for copying repository data, such as git clone and git fetch, work seamlessly with bundle files. Although you cannot push directly to a bundle, fetching from bundles and creating new ones is simple enough to cover most needs.

It may be a rare case where bundle files are needed. However, if that day ever comes, you can now feel confident that Git bundles are a flexible enough option to carry you through it.

Next steps

If you're interested in learning more about how Git works under the hood, check out our Decoding Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers who want to learn how Git works at the code level. To do this, we documented the first version of Git's code and discussed it in detail.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.