Git Fetch | How Git Fetch Works
Table of Contents
- Introduction
- What is Git Fetch?
- Git Fetch vs Git Pull
- Git Fetch Options
- How to Use Git Fetch?
- 2. Use Git Diff Master Origin/Master
- Summary
- Next Steps
- References
Introduction
Git was built around a distributed model to offer collaboration freedom. This allows you and your coworkers to check out any version of the codebase, make changes offline, and later push them to the remote repository so everyone else can view and access them.
To support a distributed architecture, Git’s creator Linus Torvalds developed a repository system to store Git’s internal objects. Each local repository has its own object database, and Git uses remote-tracking branches in conjunction with the refspec to download specific commits into it with the command git fetch. The command git pull takes it one step further by merging those downloaded commits into your current branch.
Continue reading to learn more about how git fetch works, how git fetch compares to git pull, and how to use git fetch effectively.
What is Git Fetch?
Git fetch is used to update your local repository with changes in the remote, so before diving in it helps to understand how Git links local and remote repositories.
When you clone a remote repository, a local copy is created on your machine which contains the full set of the repository's commits (and other Git objects such as blobs, trees, and tags). The local copy also contains the repository config information. However, by default only the master (or main) branch is set up to track the remote branch.
Git does this by creating a "remote-tracking branch" in the local repository, which you can think of as an intermediate version of the branch that Git uses to keep the local and remote branch copies in sync.
An entry is created for the new origin remote and master branch in the repo Git config file located at .git/config:
[remote "origin"]
url = https://repo-url
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master
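The value of the fetch field is referred to as the refspec. It has the form +<source>:<destination>, so here every branch under refs/heads/ on the remote is mapped to a remote-tracking ref under refs/remotes/origin/ in the local repository. The leading + tells Git to update the local ref even when the update is not a fast-forward. The url field identifies the remote. The [branch "master"] section records which remote, and which branch on that remote, the local master branch tracks.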
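To see these settings in a real repository, the following minimal sketch builds a throwaway remote and a clone of it in a temporary directory, then reads the two values back with git config. The directory names, the demo identity, and the use of a local path as the remote are assumptions made purely for illustration.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# A throwaway "remote" with one commit on master, plus a clone of it.
git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
git -C "$work/remote" commit -q --allow-empty -m "first commit"
git clone -q "$work/remote" "$work/local"

# Read the values from the clone's .git/config.
fetch_refspec=$(git -C "$work/local" config --get remote.origin.fetch)
merge_ref=$(git -C "$work/local" config --get branch.master.merge)
echo "fetch refspec: $fetch_refspec"
echo "master merges: $merge_ref"
```

The same two git config commits work in any existing clone; only the setup lines are specific to this sketch.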
Git separates your personal local repository commits and the remote-tracking branches using branch references, also known as refs. Each branch ref is stored in the hidden .git/ folder, which also contains Git's config, at the following paths:
- Each local branch ref is stored on the path ./.git/refs/heads/
- Each remote-tracking branch ref is stored on the path ./.git/refs/remotes/
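One way to list both kinds of refs is git for-each-ref, sketched below against a throwaway clone (the temporary directories and demo identity are assumptions for illustration). It tends to be more reliable than listing the directories by hand, because Git may also keep refs in the packed file .git/packed-refs.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
git -C "$work/remote" commit -q --allow-empty -m "first commit"
git clone -q "$work/remote" "$work/local"

# Local branches live under refs/heads, remote-tracking branches under refs/remotes.
refs=$(git -C "$work/local" for-each-ref --format='%(refname)' refs/heads refs/remotes)
echo "$refs"
```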
Each time you use the git fetch command, Git downloads any commits you don't yet have from the linked remote into the local repository and updates the matching remote-tracking branches. These fetched commits are stored in your object database so they exist locally, but are not merged into your current active branch. Therefore, fetching is useful when you want to keep your repository up to date, but don’t want the update to interfere with the files you are currently working on. You must later merge to integrate these fetched commits into your current branch.
In a nutshell, git fetch only updates your local object database and remote-tracking branches with new remote commits. Your local branches and your Git working directory remain unaffected.
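The sketch below illustrates this behaviour under a few assumptions: a throwaway remote and clone in a temporary directory, a demo identity, and a hypothetical file called notes.txt. After the fetch, the remote-tracking branch origin/master has moved, while HEAD and the checked-out file are exactly as they were.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/remote/notes.txt"
git -C "$work/remote" add notes.txt
git -C "$work/remote" commit -q -m "v1"
git clone -q "$work/remote" "$work/local"

# A new commit lands on the remote after the clone was made.
echo "v2 from remote" > "$work/remote/notes.txt"
git -C "$work/remote" commit -q -am "v2"
remote_tip=$(git -C "$work/remote" rev-parse HEAD)

head_before=$(git -C "$work/local" rev-parse HEAD)
git -C "$work/local" fetch -q origin
head_after=$(git -C "$work/local" rev-parse HEAD)
tracking=$(git -C "$work/local" rev-parse origin/master)
file_now=$(cat "$work/local/notes.txt")

echo "HEAD moved?         $([ "$head_before" = "$head_after" ] && echo no || echo yes)"
echo "origin/master at:   $tracking"
echo "notes.txt contains: $file_now"
```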
Git Fetch vs Git Pull
Both git fetch and git pull are used for updating your local repository's object database with commits and tags from a linked remote repository. On an active project, the central (remote) repository may receive new commits and tags daily. Remote-tracking branches only update when you use git fetch or git pull. The longer you wait between updates, the more outdated your remote-tracking branches become.
How to Git Fetch Remote Branch?
Git fetch is often useful when you don't want to impact files sitting in your Git working directory or in the staging area. This command won’t manipulate, destroy, or mess up your ongoing work. You can fetch as often as you want, and it won’t ever harm your workflow.
Using git fetch allows for a more careful approach to merging remote-tracking branches. Once you’ve fetched the update, you can check for the differences between your local branches and the remote-tracking branches using the git diff command. This lets you review the incoming changes and spot likely clashes with your working files before merging.
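As a rough sketch of that review step, the commands below count the commits that a fetch brought in and list the files they touch, before anything is merged. The throwaway repositories, demo identity, and notes.txt file are assumptions for illustration. The range HEAD..origin/master means "commits reachable from origin/master but not from HEAD".

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/remote/notes.txt"
git -C "$work/remote" add notes.txt
git -C "$work/remote" commit -q -m "v1"
git clone -q "$work/remote" "$work/local"

# Two new commits appear on the remote.
echo "second version" > "$work/remote/notes.txt"
git -C "$work/remote" commit -q -am "v2"
echo "third version, longer" > "$work/remote/notes.txt"
git -C "$work/remote" commit -q -am "v3"

git -C "$work/local" fetch -q origin

# How many commits are waiting, and which files do they change?
incoming=$(git -C "$work/local" rev-list --count HEAD..origin/master)
changed=$(git -C "$work/local" diff --name-only HEAD origin/master)
echo "incoming commits: $incoming"
echo "files changed:    $changed"
```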
Git Pull Branch from Remote
Newer users are probably more familiar with git pull because it does a lot of the heavy lifting for you.
Under the hood, the git pull command is simply doing a git fetch plus a git merge in one single step.
But git pull has a completely different endpoint than git fetch. When you use git pull you are updating your currently checked-out branch. The updates are not just downloaded to your object database like with git fetch, but merged into your working files.
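To make the comparison concrete, here is a minimal sketch in which two clones of the same throwaway remote are updated in the two different ways. The clone names a and b, the demo identity, and the notes.txt file are assumptions for illustration. Both clones end up on the same commit.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/remote/notes.txt"
git -C "$work/remote" add notes.txt
git -C "$work/remote" commit -q -m "v1"
git clone -q "$work/remote" "$work/a"
git clone -q "$work/remote" "$work/b"

echo "second version" > "$work/remote/notes.txt"
git -C "$work/remote" commit -q -am "v2"
remote_tip=$(git -C "$work/remote" rev-parse HEAD)

# Clone a: the two-step route.
git -C "$work/a" fetch -q origin
git -C "$work/a" merge -q origin/master

# Clone b: the one-step route.
git -C "$work/b" pull -q origin master

a_head=$(git -C "$work/a" rev-parse HEAD)
b_head=$(git -C "$work/b" rev-parse HEAD)
echo "a: $a_head"
echo "b: $b_head"
```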
Since git pull attempts to merge the pulled branch into the active branch, you may end up having to resolve a merge conflict. To keep uncommitted work out of the way, make sure your working directory is clean before running git pull. You can temporarily shelve the changes in your working directory using the git stash command.
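A minimal sketch of the stash-then-pull pattern follows. It assumes a throwaway remote and clone, a demo identity, and a hypothetical ten-line file app.txt that is edited at opposite ends by the remote and by the local user so that the two edits do not overlap.

```shell
set -e
# Hypothetical identity so commits and stashes work without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > "$work/remote/app.txt"
git -C "$work/remote" add app.txt
git -C "$work/remote" commit -q -m "initial"
git clone -q "$work/remote" "$work/local"

# The remote commits an edit to the first line.
printf 'line %s\n' "1 (remote edit)" 2 3 4 5 6 7 8 9 10 > "$work/remote/app.txt"
git -C "$work/remote" commit -q -am "remote edit"

# Meanwhile the local working copy has an uncommitted edit to the last line.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 "10 (local edit)" > "$work/local/app.txt"

git -C "$work/local" stash -q               # shelve the uncommitted edit
git -C "$work/local" pull -q origin master  # working directory is clean now
git -C "$work/local" stash pop -q           # bring the edit back on top

first_line=$(head -n 1 "$work/local/app.txt")
last_line=$(tail -n 1 "$work/local/app.txt")
echo "$first_line"
echo "$last_line"
```

If the stashed edit had touched the same lines as the pulled commit, git stash pop would report a conflict instead, which you would then resolve by hand.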
Git Fetch Options
Like most Git commands, there are several useful Git fetch options and flags:
- git fetch <remote>: Fetches all commits and related objects from all branches at the specified remote's url, such as git fetch origin. If no remote is specified, Git uses the upstream remote configured for the current branch, or origin if there is none.
- git fetch <remote> <branch>: Fetches all commits and related objects from the specified remote branch.
- git fetch --all: Fetches all commits, remote tag refs, and related objects from all registered remotes and their associated branches.
- git fetch --dry-run: Outputs the actions that would happen if you ran the fetch command, without making any changes.
- git fetch --tags: Fetches all remote tags in addition to what is fetched by the normal command.
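As an illustration of the --dry-run option, the sketch below (throwaway repositories, demo identity, and notes.txt are assumptions) records where origin/master points before a dry run, after a dry run, and after a real fetch. Only the real fetch moves the remote-tracking branch.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/remote/notes.txt"
git -C "$work/remote" add notes.txt
git -C "$work/remote" commit -q -m "v1"
git clone -q "$work/remote" "$work/local"

echo "second version" > "$work/remote/notes.txt"
git -C "$work/remote" commit -q -am "v2"
remote_tip=$(git -C "$work/remote" rev-parse HEAD)

before=$(git -C "$work/local" rev-parse origin/master)

# The report of what would be fetched is written to stderr, so capture both streams.
dry_output=$(git -C "$work/local" fetch --dry-run origin 2>&1)
after_dry=$(git -C "$work/local" rev-parse origin/master)

git -C "$work/local" fetch -q origin
after_real=$(git -C "$work/local" rev-parse origin/master)

echo "$dry_output"
echo "after dry run:    $after_dry"
echo "after real fetch: $after_real"
```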
How to Use Git Fetch?
There is a general workflow that is recommended when using git fetch. Start with git fetch, then check the differences between repositories, and finally merge the fetched changes into your desired branch. To learn the workflow, follow the steps below:
1. Update your Local Repository using Git Fetch
Before using git fetch you may need to link one or more remote repositories depending on where you want to fetch from. You do this with the git remote command:
$ git remote add sample_repo git@bitbucket.org:sample/sample_repo.git
Now you can perform a remote repository fetch:
$ git fetch sample_repo
In this case, all commits and tags from all branches of sample_repo are now downloaded to your local Git object database. If you only want a specific branch, you can include the branch name after the repo name, as follows:
$ git fetch sample_repo debug_branch
If you want to integrate this branch into your local working copy, you can check out the branch via git checkout debug_branch. Provided you have no local branch of that name yet and only one remote has a branch called debug_branch, this checkout will create a local branch with the commits that were fetched, and update the working directory to match that state. This can then be merged into any branch of your choosing by checking out the branch you want to merge into and running git merge debug_branch.
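The whole sequence can be tried end to end with the sketch below. It is a self-contained approximation, not the exact setup above: instead of an SSH url added with git remote add, it clones a throwaway local repository with -o sample_repo so that the remote is named sample_repo, and it uses a demo identity and a small debug.txt file. These are all assumptions for illustration.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/master
echo "Err 123" > "$work/remote/debug.txt"
git -C "$work/remote" add debug.txt
git -C "$work/remote" commit -q -m "initial log"

# -o names the remote sample_repo instead of the default origin.
git clone -q -o sample_repo "$work/remote" "$work/local"

# After the clone, a new branch with one extra commit appears on the remote.
git -C "$work/remote" checkout -q -b debug_branch
echo "Err 404" >> "$work/remote/debug.txt"
git -C "$work/remote" commit -q -am "add debug entry"
debug_tip=$(git -C "$work/remote" rev-parse debug_branch)

# Fetch only that branch, then check it out locally.
git -C "$work/local" fetch -q sample_repo debug_branch
git -C "$work/local" checkout -q debug_branch
upstream=$(git -C "$work/local" rev-parse --abbrev-ref 'debug_branch@{upstream}')

# Merge it into master.
git -C "$work/local" checkout -q master
git -C "$work/local" merge -q debug_branch
master_tip=$(git -C "$work/local" rev-parse master)

echo "debug_branch tracks: $upstream"
echo "master now at:       $master_tip"
```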
2. Use Git Diff Master Origin/Master
However, before merging, you may want to examine the actual fetched code changes. The git diff command is a useful way to check code changes between your local branch and remote-tracking branches that were fetched, before proceeding with the merge.
So continuing the example from above, our git diff to compare our local state with the fetched changes on the remote-tracking branch will be:
$ git diff HEAD sample_repo/debug_branch
diff --git a/debug.txt b/debug.txt
index 15827f4..8115e72 100644
--- a/debug.txt
+++ b/debug.txt
@@ -1,5 +1,5 @@
 Err 123
 Err 123
 Err 404
 Err 404
-Err 500
+Err 203
Note that this is not representative of an actual debug log, but we are using it for demonstration purposes. In the outdated version of debug.txt, line 5 read "Err 500". In the updated version of debug.txt, line 5 has been changed to "Err 203". Look through what git diff outputs and make sure the changes are what you expect. If anything looks like it will clash with your own work, address it before moving on to step 3.
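The debug.txt comparison can be reproduced with the sketch below, which names both sides of the diff explicitly (HEAD first, then the remote-tracking branch) so that lines marked - come from the local commit and lines marked + come from the fetched one. The throwaway repositories, the demo identity, and cloning with -o sample_repo to get a remote of that name are assumptions for illustration.

```shell
set -e
# Hypothetical identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# The remote's only branch is debug_branch, holding the outdated log.
git init -q "$work/remote"
git -C "$work/remote" symbolic-ref HEAD refs/heads/debug_branch
printf '%s\n' "Err 123" "Err 123" "Err 404" "Err 404" "Err 500" > "$work/remote/debug.txt"
git -C "$work/remote" add debug.txt
git -C "$work/remote" commit -q -m "outdated log"
git clone -q -o sample_repo "$work/remote" "$work/local"

# Line 5 changes on the remote after the clone.
printf '%s\n' "Err 123" "Err 123" "Err 404" "Err 404" "Err 203" > "$work/remote/debug.txt"
git -C "$work/remote" commit -q -am "updated log"

git -C "$work/local" fetch -q sample_repo

# Old side: local HEAD. New side: the fetched remote-tracking branch.
diff_out=$(git -C "$work/local" diff HEAD sample_repo/debug_branch)
removed=$(printf '%s\n' "$diff_out" | grep '^-Err')
added=$(printf '%s\n' "$diff_out" | grep '^+Err')
echo "$diff_out"
```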
3. Merge Using Git Merge
Once you’ve reviewed the changes and dealt with any potential conflicts between the remote-tracking branches and your working copy, you can use git merge to integrate the two:
$ git checkout master
$ git merge debug_branch
Git merge will result in an output that displays the files changed and the number of insertions and deletions:
Updating 15827f4..8115e72
Fast-forward
 debug.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Now that you've fetched and merged in changes from a remote repository, you've essentially learned how git pull works by doing it the manual way!
Summary
Git fetch is a powerful command to add to your Git toolkit. Git fetch is safer than pull, so use it freely and often to download commits and tags to your object database. Your local working directory is completely untouched by the fetching process. Only once you’ve verified the file changes using git diff should you move forward with merging, which ultimately has the same effect as git pull.
Next Steps
If you're interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers to learn how Git works at the code level. To do this we documented the first version of Git's code and discussed it in detail.
We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.
References
- Git SCM Docs, git fetch - https://git-scm.com/docs/git-fetch