Git Rebase | A Guide to Rebasing in Git

Introduction

"Rebase is at its essence simply taking a series of commits as if they were individual patches, and applying them at a different point in history" - Gabe da Silveira

Git rebase is one of the most commonly used ways to rewrite commit history on a branch. You can use it to clean up a feature branch prior to publishing, or to incorporate new commits from another branch. It is an integral part of several Git workflows, notably Git Flow. Furthermore, Git rebase is helpful at any stage in the Git life cycle because it allows you to maintain a clean, readable, and linear project history.

Let's take a closer look!

Background

A basic understanding of Git would be useful before reading this article. This page also assumes a basic knowledge of branches and commits in Git.

The rebase command is technically just a history rewriting command. In practice though, you can use it to integrate changes from one branch into another. You can think of branches like stacks of Lego blocks (each block is a commit). Conceptually, rebase moves one stack of blocks onto another stack of blocks. One complication is that rebase can also change the blocks themselves while moving them. More on that later.

There are two primary use cases of Git Rebase that we will go over:

  1. Integrating the latest master branch commits into a feature branch. This brings the latest changes from master into the feature branch and enables a fast-forward merge when the feature branch is finally merged back into master.
  2. Cleaning up a feature branch's history before publishing.

A fast-forward merge is an efficient way to bring the main branch of your project up to date with changes you make on a feature branch. It is possible when the main branch has no commits that the feature branch lacks. In that case, Git simply moves the main branch pointer forward to the commit the feature branch points to, without creating an additional merge commit.
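To make the pointer move concrete, here is a minimal sketch in a throwaway repository. The file names, commit messages, and the "Demo User" identity are assumptions made up for illustration; only the git commands themselves come from Git.

```shell
# Throwaway repository; nothing here touches an existing project.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

# Helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt A
commit base.txt B
git checkout -q -b feature1
commit feature.txt C
commit feature.txt D

# master has no commits that feature1 lacks, so a fast-forward is possible.
git checkout -q master
git merge -q --ff-only feature1

master_sha=$(git rev-parse master)
feature_sha=$(git rev-parse feature1)
merge_commits=$(git rev-list --merges --count master)
git log --oneline --graph master
```

After the merge, master and feature1 point at the same commit and the history contains no merge commits. The --ff-only flag makes Git refuse the merge rather than fall back to a merge commit when a fast-forward is not possible.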

Is Git Rebase Evil?

"People can (and probably should) rebase their _private_ trees (their own work). That's a _cleanup_. But never other peoples code. That's a 'destroy history'." - Linus Torvalds

You may have heard before that rebasing is dangerous and a bad idea. As usual, the devil is in the details.

If you are working in a private Git branch or private repository, it is perfectly acceptable to use Git rebase. Generally speaking, if the branch you are working on is not published to a remote repository, you can get away with using Git rebase. This is because you have the only copy of the history, and nobody else is depending on it.

It is often OK to rebase a feature branch that is shared with a small team communicating closely. Problems may still occur, but close communication will make it workable.

It is almost never OK to rebase a public branch. Other people may have a copy of the history, and they will have no easy way of resolving the inevitable conflicts if you rebase at this point.

Benefits of Rebasing

Using Git rebase will ensure your project maintains a clean, linear project history. This can prove valuable later in the event you are trying to uncover a bug using Git bisect.

If you are committing as often as you should, you may end up with some wacky commit messages on your feature branch. Rebasing allows you to modify your commits and commit messages with squash, fixup, or other commands. Doing this makes maintaining a clean history easier.

Rebasing can also ensure a fast-forward merge, which is essential in the Git Flow workflow. A fast-forward merge also provides more confidence that the code will still work after the merge, because the merged result is exactly the feature branch tip you already tested.

Why You Should Not Use Git Rebase (Drawbacks)

Rewriting history that has already been pushed to a remote repository can be a problem. Git will detect the history mismatch and reject the push. You will need to use git push --force <remote> <your_branch_name> to make it work. Additionally, if someone else pushed commits to that branch in the meantime, those commits will be overwritten on the remote.

If the rebased branch has multiple commits that change the same line and somebody also changed that line in the base branch, then you might need to resolve merge conflicts for that same line multiple times, which you never need to do when merging. On average, there will be more merge conflicts to resolve. This issue may be alleviated by rebasing and squashing on the merge-base first, before the regular rebase:

$ git merge-base master feature1
e7e77cbdd51182fc6818b946d8e50758183ad2d8
$ git checkout feature1
$ git rebase -i e7e77cbdd51182fc6818b946d8e50758183ad2d8 # squash down to one commit here. This will never have any conflicts
$ git rebase master # any conflicts will only need to be resolved once. Resolving only against the final version of the feature code is also easier.

Note: rerere is another possible solution to this issue, but it is less intuitive and requires more setup.

Finally, a rebase re-creates every commit it moves, so each moved commit gets a new SHA and its committer date becomes the time of the rebase. The author name and author date are preserved by default, but the original SHAs and committer dates are no longer part of the branch history. If you need to retain the exact commits as they were first created, then merge is the better option.
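If you want to check which timestamps a rebase touches, one option is to print the author and committer dates of a commit before and after rebasing it. The sketch below does that in a throwaway repository; the fixed timestamp, file names, and identity are assumptions chosen for illustration.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Create the original commits with a fixed, old timestamp
# (1577880000 is 2020-01-01 12:00:00 UTC).
export GIT_AUTHOR_DATE="1577880000 +0000"
export GIT_COMMITTER_DATE="1577880000 +0000"
commit base.txt A
git checkout -q -b feature1
commit feature.txt C
git checkout -q master
commit base.txt F
unset GIT_AUTHOR_DATE GIT_COMMITTER_DATE

git checkout -q feature1
author_before=$(git log -1 --format=%at feature1)
committer_before=$(git log -1 --format=%ct feature1)

git rebase -q master    # re-creates C on top of F, using the current time

author_after=$(git log -1 --format=%at feature1)
committer_after=$(git log -1 --format=%ct feature1)

echo "author date:    $author_before -> $author_after"
echo "committer date: $committer_before -> $committer_after"
```

The %at and %ct placeholders print the author and committer dates as Unix timestamps, which makes them easy to compare. Note that git log shows the author date by default, so you need --format (or --pretty=fuller) to see the committer date at all.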

Use Case 1: Preparing for Fast Forward Merge of Feature Branch Into Master

Rebase can be used to make a fast-forward merge possible between two branches that have diverged [3].

Let's look at an example (' denotes same content but different SHA):

Step 1. You git checkout a feature branch and make some commits. Commit B is the merge base, or latest common ancestor, of master and feature1.

A-B (master, origin/master)
   \
    C-D-E (feature1, origin/feature1)

Step 2. New commits are added to master, so the two branches have now diverged. You can detect this by running git fetch --all, which will show that origin/master has changed. Then run git log to see the new commits.

A-B-F-G (master, origin/master)
   \
    C-D-E (feature1, origin/feature1)

Step 3. Run git rebase master from the feature1 branch. Use git status to view conflicts as they come up.

A-B-F-G (master, origin/master)
       \
        C'-D'-E' (feature1)

Note that origin/feature1 still points at the original commit E until you push the rebased branch.
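The three steps can be reproduced in a throwaway repository. This is a sketch under assumed file names and an assumed identity; it uses git merge-base --is-ancestor to ask whether master could be fast-forwarded to feature1 before and after the rebase.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Step 1: feature1 branches off at B.
commit base.txt A
commit base.txt B
git checkout -q -b feature1
commit feature.txt C
commit feature.txt D
commit feature.txt E

# Step 2: master moves on, so the branches diverge.
git checkout -q master
commit base.txt F
commit base.txt G

# A fast-forward needs master to be an ancestor of feature1.
if git merge-base --is-ancestor master feature1; then ff_before=yes; else ff_before=no; fi

# Step 3: replay C, D and E on top of G.
git checkout -q feature1
git rebase -q master

if git merge-base --is-ancestor master feature1; then ff_after=yes; else ff_after=no; fi
ahead=$(git rev-list --count master..feature1)

echo "fast-forward possible before rebase: $ff_before"
echo "fast-forward possible after rebase:  $ff_after"
echo "commits on feature1 that master lacks: $ahead"
```

The commits touch different files here, so the rebase finishes without conflicts. After it, feature1 is exactly three commits ahead of master and master can be fast-forwarded to it.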

While rebasing, after resolving any conflicts and adding them to the staging area, run git rebase --continue to move on.

Made a mistake and want to quit? Run git rebase --abort before the rebase finishes to reset the branch to the state it was in before the rebase started and exit the rebase context.

Note: While rebasing, you may run into merge conflicts. Merge conflicts just mean that Git needs help resolving a merge. In this case, Git will inject text into the affected files like the following:

<<<<<<< # Indicates the start of the lines that had a merge conflict.
# code in this section is current (from the anonymous branch that will become the rebase result)
======= # Indicates separation of the two conflicting changes.
# code in this section is incoming (from the branch that is being rebased)
>>>>>>> # Indicates the end of the lines that had a merge conflict.
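The sketch below walks through a conflicting rebase twice in a throwaway repository: once backing out with --abort, and once resolving the conflict and finishing with --continue. The config.txt file and its contents are made up for illustration, and GIT_EDITOR=true is only there so the commit message is accepted without opening an editor.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

# Both branches change the same line, which guarantees a conflict.
echo "color = red" > config.txt
git add config.txt
git commit -q -m "A"
git checkout -q -b feature1
echo "color = blue" > config.txt
git commit -q -am "C: feature wants blue"
git checkout -q master
echo "color = green" > config.txt
git commit -q -am "F: master wants green"

git checkout -q feature1
before=$(git rev-parse feature1)

# First attempt: hit the conflict, look at the markers, then back out.
git rebase master >/dev/null 2>&1 || echo "rebase stopped on a conflict"
markers=$(grep -c '^<<<<<<<' config.txt)
git rebase --abort
after_abort=$(git rev-parse feature1)

# Second attempt: resolve the conflict, stage the file, and continue.
git rebase master >/dev/null 2>&1 || true
echo "color = blue" > config.txt          # keep the feature branch's version
git add config.txt
GIT_EDITOR=true git rebase --continue >/dev/null

parent=$(git rev-parse feature1^)
master_sha=$(git rev-parse master)
resolved=$(cat config.txt)
```

After --abort, feature1 points at the same commit as before the rebase started. After --continue, the rebased commit sits directly on top of master and contains the resolution you staged.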

What To Do After Rebasing?

You already read about the drawbacks of force pushing after a rebase. A safer way is to always use the --force-with-lease flag instead of --force. It rejects the push if the remote branch has changed since you last fetched it, so you cannot silently overwrite someone else's commits.
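Here is a minimal sketch of pushing a rebased branch, using a local bare repository as a stand-in for the remote. The directory names (seed, origin.git, work), file names, and identity are assumptions for illustration.

```shell
root=$(mktemp -d) && cd "$root" || exit 1

# A seed repository, a bare "remote" cloned from it, and a working clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed -c user.name="Demo User" -c user.email="demo@example.com" \
    -c commit.gpgsign=false commit -q --allow-empty -m "A"
git clone -q --bare seed origin.git
git clone -q origin.git work
cd work || exit 1
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Publish feature1, then let master move on and rebase feature1 onto it.
git checkout -q -b feature1
commit feature.txt C
git push -q -u origin feature1
git checkout -q master
commit base.txt F
git checkout -q feature1
git rebase -q master

# A plain push is rejected because the remote history no longer matches.
if git push -q origin feature1 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# The lease holds: nobody has pushed to feature1 since our last fetch.
git push -q --force-with-lease origin feature1

local_sha=$(git rev-parse feature1)
remote_sha=$(git -C ../origin.git rev-parse feature1)
echo "plain push: $plain_push"
```

In this sketch nobody else pushed to feature1 in the meantime, so the lease is honored. If another clone had pushed first, --force-with-lease would fail and you would need to fetch and inspect the new commits before trying again.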

A common follow-up after a Git rebase is to open a merge request to merge the feature branch into master. This merge request can then be merged as a fast-forward.

Also, if you are working with a small team, make sure you communicate that the rebase has happened.

Use Case 2: Cleanup: Squash, Fixup, Reorder, etc

Rebase can be used to clean up your commit history and even to change multiple commit messages [4].

This is often combined with use case 1.

Use git rebase --interactive <base> or git rebase -i <base> (for example, git rebase -i master) to open the list of commits in your editor:

pick 869d84f4 C
pick 964f1f4e D
pick 0772043e E

# Rebase 08c1230b..0772043e onto 08c1230b (3 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
# commit's log message, unless -C is used, in which case
# keep only this commit's message; -c is same as -C but
# opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# . create a merge commit using the original merge commit's
# . message (or the oneline, if no original merge commit was
# . specified); use -c <commit> to reword the commit message
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#

Reorder the lines and change the commands from the default (pick) to whatever you need. For example, to squash everything and get an opportunity to edit the commit message, change it to:

pick 869d84f4 C
s 964f1f4e D
s 0772043e E
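The same squash can be scripted, which is handy for trying it out safely. In the sketch below, GIT_SEQUENCE_EDITOR stands in for editing the todo list by hand, and GIT_EDITOR=true accepts the combined commit message unchanged. The sed expression, file names, and identity are assumptions for illustration, and the repository is a throwaway one.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt A
git checkout -q -b feature1
commit feature.txt C
commit feature.txt D
commit feature.txt E

# Keep the first "pick" and turn every later "pick" into "squash".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i master >/dev/null 2>&1

commits=$(git rev-list --count master..feature1)
lines=$(wc -l < feature.txt | tr -d ' ')
git log -1 --format=%B feature1
```

The branch ends up with a single commit on top of master that contains all three changes, and its message is the concatenation of the messages of C, D, and E. When you run the rebase by hand you would edit that message in your editor instead.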

What Is the Difference Between Merge and Rebase?

Rebase is not a substitute for merging. Most teams use both together to integrate changes safely and efficiently. Merge does not rewrite or clean history: it adds a merge commit and leaves the existing commits, including their SHAs and dates, untouched. Rebase, on the other hand, does not add a merge commit, but it rewrites history by creating new commits with new SHAs and committer dates, which is what lets you clean it up.

What Is the Difference Between Git Pull and Git Pull --Rebase?

Both are used to integrate changes from the remote into the local branch. If the current branch is simply behind the remote, both fast-forward it to match the remote. They differ when the current branch and the remote have diverged: a plain git pull either creates a merge commit or, depending on your Git version and configuration, refuses to proceed until you choose a strategy, whereas git pull --rebase replays your local commits on top of the remote branch.
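As a sketch of the diverged case, the script below sets up a local bare repository as the remote and two clones. The names alice and bob, the file names, and the commit messages are hypothetical.

```shell
root=$(mktemp -d) && cd "$root" || exit 1

# A seed repository, a bare "remote", and two clones of it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed -c user.name="Demo User" -c user.email="demo@example.com" \
    -c commit.gpgsign=false commit -q --allow-empty -m "A"
git clone -q --bare seed origin.git
git clone -q origin.git alice
git clone -q origin.git bob
for who in alice bob; do
  git -C "$who" config user.name "$who"
  git -C "$who" config user.email "$who@example.com"
  git -C "$who" config commit.gpgsign false
done

# Alice publishes a commit.
echo "from alice" > alice/a.txt
git -C alice add a.txt
git -C alice commit -q -m "B (alice)"
git -C alice push -q origin master

# Bob commits locally without fetching, so his master and origin/master diverge.
echo "from bob" > bob/b.txt
git -C bob add b.txt
git -C bob commit -q -m "C (bob)"

# Fetch Alice's commit and replay Bob's commit on top of it.
git -C bob pull -q --rebase origin master

merges=$(git -C bob rev-list --merges --count master)
total=$(git -C bob rev-list --count master)
top=$(git -C bob log -1 --format=%s master)
git -C bob log --oneline --graph master
```

Bob's history ends up linear: A, then Alice's commit, then his own commit replayed on top, with no merge commit in between.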

Dealing With When Git Fetch Shows a Forced Change but You Already Committed Locally

In this case, you will need to transplant your local commits onto the rewritten remote history. You can accomplish this with git rebase --onto.

State before forced push:

A-B-C-D (origin/feature1)
       \
        E (feature1)

State after forced push (someone squashed B, C, and D into B'):

A-B' (origin/feature1)

A-B-C-D-E (feature1)

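Use git log to identify the SHA of your oldest unpublished commit (commit E).

Use git rebase --onto origin/feature1 <Commit E SHA>^ feature1. The trailing ^ selects the parent of E, because rebase replays the commits that come after the given commit and does not include the commit itself. This results in:

A-B' (origin/feature1)
    \
     E' (feature1)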
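The transplant can be tried out locally. In this sketch two local branches stand in for the remote: "published" plays the old origin/feature1 and "rewritten" plays origin/feature1 after the forced push. Those branch names, the file names, and the identity are assumptions for illustration. Because the second argument of --onto is exclusive, the sketch passes feature1^ (the parent of E) rather than E itself.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # illustrative identity
git config user.email "demo@example.com"
git config commit.gpgsign false

commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt A
a_sha=$(git rev-parse HEAD)

# State before the forced push: A-B-C-D published, E only local.
git checkout -q -b published
commit work.txt B
commit work.txt C
commit work.txt D
git checkout -q -b feature1
commit mine.txt E

# Simulate the forced push: B, C and D squashed into B' on top of A.
git checkout -q -b rewritten "$a_sha"
git merge -q --squash published >/dev/null
git commit -q -m "B'"

# Replay only what comes after E's parent (that is, just E) onto the new history.
git rebase -q --onto rewritten feature1^ feature1

count=$(git rev-list --count feature1)
parent=$(git rev-parse feature1^)
rewritten_sha=$(git rev-parse rewritten)
git log --oneline feature1
```

feature1 now consists of A, B', and the replayed copy of E, with B' as its direct parent. With a real remote you would use origin/feature1 in place of the rewritten branch, as in the command above.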

When to Use Git Pull, Git Rebase, or Git Merge?

Each of these commands has its nuances, but there are rules of thumb for their usage.

Use Git pull when you want to get the latest changes from your own remote branch. Technically you can use this to incorporate changes from another branch, but that usage is less intuitive.

Use Git merge when you want to integrate the latest changes from another branch. For example, to integrate changes from a feature branch into master.

Use Git rebase when you want to clean up history or before merging so that the merge will fast-forward. For example, it is common to rebase a feature branch on master prior to merging the feature branch into master.

Should I Always Git Rebase?

Should you always rebase before pushing? Not necessarily. First, ask yourself what you would gain from it and also who else may be impacted if you rebase.

If you are working on a private project, you may rebase as often as you wish. However, if you are working on a team project, you should consult the contributor's guide or talk with your team members to gain consensus. Some teams prefer branches to be rebased before opening a merge request.

If you are working on a public project, such as open source software on GitHub, then rebasing should be a rare exception due to the hassle it could cause other contributors.

Summary

Git rebase is an effective tool for modifying your commits, with commands like squash, fixup, and reorder. Though it may not be right to use all the time, Git rebase can come in handy when you are working in a private Git repository or on a small team. Furthermore, rebase makes it easy to clean up a branch before merging in Git Flow, which makes it an essential component in workflows where you want to maintain a clean and readable project history.

We just learned what Git rebase is and why it isn't always a bad thing despite the reputation. Additionally, we went over tactics on how to use Git rebase for various scenarios and avoid common pitfalls.

By now, you should feel comfortable enough to start using some of these concepts on your own projects.

Next Steps

If you're interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git's code in an accessible way. We wrote it for curious developers to learn how Git works at the code level. To do this, we documented the first version of Git's code and discuss it in detail.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.

References

  1. http://www.darwinweb.net/articles/the-case-for-git-rebase
  2. https://lwn.net/Articles/328438/
  3. https://www.youtube.com/watch?v=TymF3DpidJ8
  4. https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#_changing_multiple
