Git Fetch | How Git Fetch Works

Introduction

Git was built around a distributed model that gives collaborators freedom in how they work. You and your coworkers can check out any version of the codebase, make changes offline, and later push them to the remote repository so everyone else can view and access them.

To support a distributed architecture, Git’s creator Linus Torvalds developed a repository system to store Git’s internal objects. Each local repository has its own object database, and Git uses remote-tracking branches together with refspecs to decide which commits the git fetch command downloads into it. The git pull command takes it one step further by also merging those downloaded commits into your current branch.

Continue reading to learn more about how git fetch works, how git fetch compares to git pull, and how to use git fetch effectively.

What is Git Fetch?

Git fetch is used to update your local repository with changes in the remote, so before diving in it helps to understand how Git links local and remote repositories.

When you clone a remote repository, Git creates a local copy on your machine that contains the full set of the repository's commits (and other Git objects such as blobs, trees, and tags), along with the repository config information. However, by default only the remote's default branch, usually master (or main), gets a local branch that is set up to track its remote counterpart.

Git does this by creating a "remote-tracking branch" in the local repository, which you can think of as an intermediate version of the branch that Git uses to keep the local and remote branch copies in sync.

An entry is created for the new origin remote and master branch in the repo Git config file located at .git/config:

[remote "origin"]
        url = https://repo-url
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master

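The url field identifies the remote. The value of the fetch field is called the refspec, and it has the form +<source>:<destination>. Refs matching refs/heads/* on the remote (its branches) are copied to refs/remotes/origin/* in the local repository (your remote-tracking branches). The leading + tells Git to update a remote-tracking branch even when the update is not a fast-forward. The [branch "master"] section records that the local master branch tracks refs/heads/master on origin.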
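To see these config entries for yourself, the following sketch builds a throwaway repository, clones it, and reads the values back with git config. The directory names upstream and clone and the file readme.txt are hypothetical, and the branch is forced to master only to match the example above.

```shell
set -eu
# Throwaway identity so commits work on a machine without Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

# Stand-in for the remote: one commit on a branch named master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "hello" > upstream/readme.txt
git -C upstream add readme.txt
git -C upstream commit -q -m "Initial commit"

# Cloning writes the [remote "origin"] and [branch "master"] sections.
git clone -q upstream clone

fetch_refspec=$(git -C clone config --get remote.origin.fetch)
branch_remote=$(git -C clone config --get branch.master.remote)
branch_merge=$(git -C clone config --get branch.master.merge)

echo "remote.origin.fetch  = $fetch_refspec"
echo "branch.master.remote = $branch_remote"
echo "branch.master.merge  = $branch_merge"
```

Reading values through git config rather than parsing .git/config by hand avoids depending on the file's exact layout.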

Git separates your personal local repository commits and the remote-tracking branches using branch references, also known as refs. Each branch ref is stored in the hidden .git/ folder - which also contains Git's config - at the following paths:

  • Each local branch ref is stored on the path ./.git/refs/heads/
  • Each remote-tracking branch ref is stored on the path ./.git/refs/remotes/
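As a rough illustration of the two ref namespaces, the sketch below clones a throwaway repository that has two branches and lists the refs in each namespace. The names upstream, clone, and feature are made up for the example. It uses git for-each-ref instead of listing files, because Git may also keep refs in the .git/packed-refs file rather than as individual files.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

# Stand-in remote with two branches: master and feature.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "hello" > upstream/readme.txt
git -C upstream add readme.txt
git -C upstream commit -q -m "Initial commit"
git -C upstream branch feature

git clone -q upstream clone

# Local branches live under refs/heads, remote-tracking ones under refs/remotes.
local_refs=$(git -C clone for-each-ref --format='%(refname)' refs/heads)
remote_refs=$(git -C clone for-each-ref --format='%(refname)' refs/remotes)

echo "Local branch refs:"
printf '%s\n' "$local_refs"
echo "Remote-tracking refs:"
printf '%s\n' "$remote_refs"
```

In this setup the clone ends up with a single local branch but a remote-tracking ref for each branch on the remote.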

Each time you use the git fetch command, Git downloads any commits from the linked remote that you don't already have locally and moves the matching remote-tracking branches to point at them. These fetched commits are stored in your object database so they exist locally, but they are not merged into your current active branch. Git fetch is therefore useful when you want to keep your repository up to date but don’t want the update to interfere with the files you are working on. You must later merge to integrate these fetched commits into your current branch.

In a nutshell, git fetch only updates your local object database and remote-tracking branches with new remote commits. Your local branches and your working directory remain unaffected.
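The sketch below checks this behavior on a throwaway pair of repositories. The names upstream, clone, and notes.txt are assumptions made for the example. After a new commit appears on the remote, a fetch moves origin/master but leaves HEAD and the checked-out file alone.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "v1" > upstream/notes.txt
git -C upstream add notes.txt
git -C upstream commit -q -m "First commit"
git clone -q upstream clone

# A collaborator adds a commit to the remote after we cloned.
echo "v2" > upstream/notes.txt
git -C upstream commit -q -am "Second commit"

head_before=$(git -C clone rev-parse HEAD)
git -C clone fetch -q origin

head_after=$(git -C clone rev-parse HEAD)
tracking_tip=$(git -C clone rev-parse origin/master)
upstream_tip=$(git -C upstream rev-parse HEAD)
file_content=$(cat clone/notes.txt)

echo "HEAD before fetch: $head_before"
echo "HEAD after fetch:  $head_after"
echo "origin/master now: $tracking_tip"
echo "notes.txt still contains: $file_content"
```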

Git Fetch vs Git Pull

Both git fetch and git pull are used for updating your local repository's object database with commits and tags from a linked remote repository. On an active project, the central (remote) repository may receive new commits and tags daily. Remote-tracking branches only update when you use git fetch or git pull. The longer you wait between updates, the more outdated your remote-tracking branches become.

How to Git Fetch Remote Branch?

Git fetch is often useful when you don't want to impact files sitting in your Git working directory or in the staging area. This command won’t manipulate, destroy, or mess up your ongoing work. You can fetch as often as you want, and it won’t ever harm your workflow.

Using git fetch allows for a more careful approach to merging remote-tracking branches. Once you’ve fetched the update, you can check for the differences between your local branches and the remote-tracking branches, using the git diff command. This enables you to verify that these changes won’t conflict with your working files, before merging.
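One possible way to review fetched work before merging is sketched below, again on throwaway repositories with hypothetical names (upstream, clone, notes.txt, todo.txt). It counts how many commits the local branch is behind and lists the files those commits touch.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "v1" > upstream/notes.txt
git -C upstream add notes.txt
git -C upstream commit -q -m "First commit"
git clone -q upstream clone

# Two new commits land on the remote after the clone.
echo "v2" > upstream/notes.txt
git -C upstream commit -q -am "Second commit"
echo "draft" > upstream/todo.txt
git -C upstream add todo.txt
git -C upstream commit -q -m "Third commit"

cd clone
git fetch -q origin

# Commits reachable from origin/master but not from the local master.
behind=$(git rev-list --count master..origin/master)
# Files that differ between the local branch and the fetched branch.
changed_files=$(git diff --name-only master origin/master)

echo "master is $behind commit(s) behind origin/master"
git --no-pager log --oneline master..origin/master
printf 'Files changed:\n%s\n' "$changed_files"
```

The range master..origin/master reads as "what origin/master has that master does not", which is usually the question you want answered before a merge.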

Git Pull Branch from Remote

Newer users are probably more familiar with git pull because it does a lot of the heavy lifting for you.

Under the hood, the git pull command by default runs a git fetch followed by a git merge in a single step.

But git pull has a different end result than git fetch. When you use git pull you are updating your currently checked-out branch. The updates are not just downloaded to your object database like with git fetch, but merged into your current branch and working files.

Since git pull attempts to merge the pulled branch into the active branch, you may end up having to resolve a merge conflict. A clean working directory does not prevent conflicts between committed changes, but it keeps uncommitted work from getting in the way, since Git refuses to merge over local modifications it would overwrite. You can temporarily shelve uncommitted changes with the git stash command before running git pull.
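To make the "fetch plus merge" relationship concrete, here is a small sketch that updates two clones of the same throwaway repository, one with git pull and one with git fetch followed by git merge. The names upstream, clone_a, and clone_b are hypothetical, and the sketch assumes Git's default pull behavior (merge, not rebase) in a simple fast-forward situation.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "v1" > upstream/notes.txt
git -C upstream add notes.txt
git -C upstream commit -q -m "First commit"

git clone -q upstream clone_a
git clone -q upstream clone_b

# New work appears on the remote after both clones were made.
echo "v2" > upstream/notes.txt
git -C upstream commit -q -am "Second commit"

# clone_a: one step.
git -C clone_a pull -q

# clone_b: the same result in two explicit steps.
git -C clone_b fetch -q origin
git -C clone_b merge -q origin/master

pulled_tip=$(git -C clone_a rev-parse HEAD)
merged_tip=$(git -C clone_b rev-parse HEAD)
upstream_tip=$(git -C upstream rev-parse HEAD)

echo "pull result:          $pulled_tip"
echo "fetch + merge result: $merged_tip"
```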

Git Fetch Options

Like most Git commands, there are several useful Git fetch options and flags:

  • git fetch <remote>: Fetches all commits and related objects from all branches of the specified remote, such as git fetch origin. If no remote is given, Git uses the remote configured as the upstream of the current branch, falling back to origin.
  • git fetch <remote> <branch>: Fetches all commits and related objects from the specified remote branch.
  • git fetch --all: Fetches all commits, remote tag refs, and related objects from all registered remotes and their associated branches.
  • git fetch --dry-run: Outputs the actions that would happen if you ran the fetch command, without making any changes.
  • git fetch --tags: Fetches remote tags in addition to what is fetched by the normal command.
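As a quick check of the --dry-run option, the sketch below compares the remote-tracking branch before and after a dry run, and then after a real fetch. The repository names upstream and clone are hypothetical. Note that Git prints the fetch summary to standard error.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all names below are hypothetical
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "v1" > upstream/notes.txt
git -C upstream add notes.txt
git -C upstream commit -q -m "First commit"
git clone -q upstream clone

echo "v2" > upstream/notes.txt
git -C upstream commit -q -am "Second commit"

before=$(git -C clone rev-parse origin/master)

# Reports what would be updated, but leaves origin/master where it was.
git -C clone fetch --dry-run origin
after_dry_run=$(git -C clone rev-parse origin/master)

# The real fetch moves origin/master to the remote's newest commit.
git -C clone fetch -q origin
after_fetch=$(git -C clone rev-parse origin/master)
upstream_tip=$(git -C upstream rev-parse HEAD)

echo "before:        $before"
echo "after dry run: $after_dry_run"
echo "after fetch:   $after_fetch"
```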

How to Use Git Fetch?

There is a general workflow that is recommended when using git fetch. Start with git fetch, then check the differences between repositories, and finally merge the fetched changes into your desired branch. To learn the workflow, follow the steps below:

1. Update your Local Repository using Git Fetch

Before using git fetch you may need to link one or more remote repositories depending on where you want to fetch from. You do this with the git remote command:

$ git remote add sample_repo git@bitbucket.org:sample/sample_repo.git

Now you can perform a remote repository fetch:

$ git fetch sample_repo

In this case, all commits and tags from all branches of sample_repo are now downloaded to your local Git object database. If you only want a specific branch, you can include the branch name after the repo name, as follows:

$ git fetch sample_repo debug_branch 

If you want to integrate this branch into your local working copy, you can check out the branch via git checkout debug_branch. This creates a local branch with the commits that were fetched and updates the working directory to match that state. You can then merge it into any branch of your choosing by checking out the branch you want to merge into and running git merge debug_branch.

2. Use Git Diff Master Origin/Master

However, before merging, you may want to examine the actual fetched code changes. The git diff command is a useful way to check code changes between your local branch and remote-tracking branches that were fetched, before proceeding with the merge.

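So continuing the example from above, we compare our current commit (HEAD) with the fetched changes on the remote-tracking branch. Naming both sides makes the direction explicit, so lines introduced by the fetched branch are shown with a +:

$ git diff HEAD sample_repo/debug_branch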

diff --git a/debug.txt b/debug.txt
index 15827f4..8115e72 100644
--- a/debug.txt
+++ b/debug.txt
@@ -1,5 +1,5 @@
 Err 123
 Err 123
 Err 404
 Err 404
-Err 500
+Err 203

Note that this is not representative of an actual debug log, but we are using it for demonstration purposes. In the outdated version of debug.txt, line 5 read "Err 500". In the updated version of debug.txt, line 5 has been changed to "Err 203". Look through the git diff output and make sure the changes are what you expect. If they would clash with your own work, address that before moving on to step 3.

3. Merge Using Git Merge

Once you’ve verified and fixed any potential conflicts between the remote-tracking branches and your working copy, you can move on by using git merge to integrate these two together:

$ git checkout master
$ git merge debug_branch

Git merge will result in an output that displays the files changed and the number of insertions and deletions:

Updating 15827f4..8115e72
Fast-forward
 debug.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Now that you've fetched and merged in changes from a remote repository, you've essentially learned how git pull works by doing it the manual way!
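The three steps can be rehearsed end to end on a scratch setup. The sketch below is one way to do that, under stated assumptions: sample_repo is a local directory standing in for the Bitbucket remote, local is a hypothetical name for your own repository, and the diff names both branches explicitly so that lines introduced by debug_branch appear with a +.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)   # scratch directory; all paths below are hypothetical
cd "$work"

# Stand-in remote: master, plus a debug_branch that edits line 5.
git init -q sample_repo
git -C sample_repo symbolic-ref HEAD refs/heads/master
printf 'Err 123\nErr 123\nErr 404\nErr 404\nErr 500\n' > sample_repo/debug.txt
git -C sample_repo add debug.txt
git -C sample_repo commit -q -m "Add debug.txt"
git -C sample_repo checkout -q -b debug_branch
printf 'Err 123\nErr 123\nErr 404\nErr 404\nErr 203\n' > sample_repo/debug.txt
git -C sample_repo commit -q -am "Change line 5"
git -C sample_repo checkout -q master

# Step 1: link the remote and fetch from it.
git init -q local
cd local
git remote add sample_repo "$work/sample_repo"
git fetch -q sample_repo
git checkout -q -b master sample_repo/master

# Step 2: review what debug_branch changes relative to master.
diff_text=$(git diff master sample_repo/debug_branch)
printf '%s\n' "$diff_text"

# Step 3: create the local debug_branch, then merge it into master.
git checkout -q debug_branch
git checkout -q master
git merge -q debug_branch

merged_tip=$(git rev-parse master)
remote_tip=$(git rev-parse sample_repo/debug_branch)
last_line=$(tail -n 1 debug.txt)
echo "Line 5 after merge: $last_line"
```

Because master has no commits of its own beyond the shared history, the merge here is a fast-forward. With diverging histories, Git would create a merge commit or report conflicts instead.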

Summary

Git fetch is a powerful command to add to your Git toolkit. It is safer than git pull because it leaves your local working directory completely untouched, so use it freely and often to download commits and tags to your object database. Only once you’ve verified the file changes using git diff should you move forward with merging, which ultimately has the same effect as git pull.

Next Steps

If you're interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers who want to learn how Git works at the code level. To do this, we documented the first version of Git's code and discussed it in detail.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.

References

  1. Git SCM Docs, git fetch - https://git-scm.com/docs/git-fetch

Final Notes