In the first post in this series we covered how to install Git, some essential configuration and a basic Git workflow. In the previous post we covered how to instruct a repository to ignore files or directories and how to view changes made to files over time. In this post we are going to cover how to undo changes to files, staged changes and commits.
No need to find a stasis leak to alter history - undo mistakes using the Git commands discussed in this post
Your average desk jockey will save their work every few minutes while using any given desktop program. Data scientists and developers might supplement “Cmd + S”, or “Ctrl + S” if you’re not a Mac user 🤐, with git add filename to periodically save the most recent changes made to a specific file to the staging area. If you stage a file and decide you want to unstage it immediately, you can do this using git reset HEAD filename.
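To see staging and unstaging in action, here is a minimal sketch you can run in a throwaway repository. The temporary directory, the file name analysis.R and the user details are placeholders I have made up for illustration, so swap in your own.

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a first version so that HEAD exists
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "added analysis.R"

# Edit the file and stage the change
echo "y <- 2" >> analysis.R
git add analysis.R
git status --short    # the change is listed as staged

# Unstage it again; the edit itself stays in the working directory
git reset -q HEAD analysis.R
git status --short    # the change is now listed as unstaged
```

Note that unstaging does not touch the contents of the file: the extra line is still there, it is just no longer queued up for the next commit.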
Let’s imagine you have made some changes to an analysis script, you haven’t staged them, but you decide you want to undo them. Your text editor might be able to do this (“Cmd + Z” or “Ctrl + Z” for the shortcut junkies out there), but sometimes this won’t be possible. Git can also do this for you with git checkout -- filename. This will discard all changes to the file specified by filename that have not yet been staged. Note that filename can be either the name of a file in your current working directory or a path to a specific file.
Caution: once you discard changes in this way, they are gone forever.
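If you want to try this safely before using it on real work, the following sketch discards an unstaged edit in a scratch repository. As before, the directory, file name and user details are assumptions made purely for the example.

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a first version of the script
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "added analysis.R"

# Make an edit but do not stage it
echo "y <- 2" >> analysis.R
git diff --stat       # shows that analysis.R has unstaged changes

# Discard the unstaged edit; there is no way to get it back afterwards
git checkout -- analysis.R
cat analysis.R        # the file is back to its committed contents
```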
By combining the ability of git reset to unstage files with git checkout, you can undo the changes to a file that you have staged:
# unstage the file
git reset HEAD filename
# revert the changes made to the file since the last commit
git checkout -- filename
As before, filename can be either the name of a file in your current working directory or a path to a specific file within the repository.
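If you find yourself running these two commands together a lot, you could wrap them in a small shell function. The sketch below does that in a scratch repository; undo_staged is a hypothetical helper name of my own, not a Git command, and the file and user details are placeholders.

```shell
# Hypothetical helper: unstage a file, then discard its changes
undo_staged() {
  git reset -q HEAD -- "$1" && git checkout -- "$1"
}

# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "added analysis.R"

# Stage an edit that we then decide we do not want
echo "y <- 2" >> analysis.R
git add analysis.R

# Undo it in one go
undo_staged analysis.R
git status --short    # prints nothing: the file matches the last commit
```

The same caution applies here as with git checkout on its own: the discarded edit cannot be recovered.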
As we have just seen, git checkout can be used to undo the changes made to a file since the last commit. This command can also be used to go back even further into a file’s history and restore versions of that file from a commit. In this case, the commit represents a saved version of your work that can be loaded using the git checkout command.
The syntax for restoring an old version takes two arguments: the hash that identifies the version you want to restore, and the name of the file. Recall from last time that we can use git log to view repository history. For example, viewing my git_tutorial history might produce this output:
commit e205be814ab8883e8a6bfa873d4495946b0771b9
Author: lquayle88 <drlquayle@gmail.com>
Date:   Wed Apr 27 07:00:00 2022 +0100

    modified example.txt

commit ee4476a0fde3b9e5df5d95946b0771b9e205be81
Author: lquayle88 <drlquayle@gmail.com>
Date:   Wed Apr 6 07:00:00 2022 +0100

    added example.txt
If I wanted to revert example.txt from the current version to the version that existed on April 6th 2022, I could use git checkout ee4476a0 example.txt. Here ee4476a0 is just the first few characters of the full hash; Git accepts any prefix that uniquely identifies the commit. You might have noticed that this is the same syntax that you used to undo the unstaged changes, except “--” has been replaced by a commit hash.
Restoring a file in this way doesn’t erase any repository history. Git places the restored version in both your working directory and the staging area; once you commit it, the restoration is saved as another commit, because you might later want to undo your undoing 🧐 .
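Here is a rough sketch of the whole round trip in a scratch repository: two commits to example.txt, then a restore of the first version. The file contents, commit messages and user details are invented for the example, and the sketch commits the restored file explicitly so that the restoration shows up in the log.

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First version of the file
echo "first version" > example.txt
git add example.txt
git commit -q -m "added example.txt"
old=$(git rev-parse --short HEAD)   # abbreviated hash of the first commit

# Second version of the file
echo "second version" > example.txt
git add example.txt
git commit -q -m "modified example.txt"

# Restore the file as it was in the first commit
git checkout "$old" example.txt
git status --short    # the restored file is already staged

# Record the restoration as a new commit
git commit -q -m "restored first version of example.txt"
git log --oneline     # three commits: nothing has been erased
```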
Pro Tip: recall from last time that git log -n x filename allows us to view the last x commits for the file specified by filename. This can be handy when reverting a specific file to a previous version without having to trawl through the entire repository commit log.
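As a sketch of how that tip might play out, the example below builds a small history in which commits to example.txt are interleaved with commits to another file, then uses the per-file log to pick out the hash to restore. The file names, contents and the target variable are all assumptions for illustration.

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Interleave commits to two different files
echo "v1" > example.txt
git add example.txt
git commit -q -m "added example.txt"
echo "notes" > other.txt
git add other.txt
git commit -q -m "added other.txt"
echo "v2" > example.txt
git add example.txt
git commit -q -m "modified example.txt"
echo "more notes" >> other.txt
git add other.txt
git commit -q -m "modified other.txt"

# Show only the last two commits that touched example.txt
git log -n 2 --oneline example.txt

# Take the older of those two commits and restore the file from it
target=$(git log -n 2 --format=%h example.txt | tail -n 1)
git checkout "$target" example.txt
cat example.txt
```

The commits that only touched other.txt never appear in the filtered log, which is what saves you the trawling.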
So far, we have discussed how to undo changes one file at a time. However, you might sometimes want to undo changes to many files simultaneously.
Given that HEAD is the default commit and always refers to the most recent commit on the current branch, we can simply use git reset without providing any additional arguments to unstage all changes in the current repository. Alternatively, we could pass a directory as an argument to unstage all changes to files in that directory. For example, git reset HEAD git_tutorial would unstage any files from my git_tutorial directory. Running git checkout -- git_tutorial would then restore the files in my git_tutorial directory to their previous state, i.e. the state in which they existed at the most recent commit. We cannot omit an argument with git checkout to lazily revert everything in the repository to its previous state. However, because we can refer to the current directory using a period “.”, we can revert all files in the current directory, including its subdirectories, using the command git checkout -- .
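The directory-level and repository-level versions can be sketched like this in a scratch repository. The git_tutorial directory mirrors the example above, while the files inside it and the user details are made up for the demonstration.

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit two files inside a directory
mkdir git_tutorial
echo "a" > git_tutorial/one.txt
echo "b" > git_tutorial/two.txt
git add git_tutorial
git commit -q -m "added tutorial files"

# Edit and stage both files, then undo everything in that directory
echo "edit" >> git_tutorial/one.txt
echo "edit" >> git_tutorial/two.txt
git add git_tutorial
git reset -q HEAD git_tutorial     # unstage the whole directory
git checkout -- git_tutorial       # discard the edits in the whole directory

# Same idea for the entire repository, run from its top level
echo "edit" >> git_tutorial/one.txt
git add .
git reset -q                       # unstage everything
git checkout -- .                  # discard all edits below the current directory
git status --short                 # prints nothing
```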
There will come a time when you just straight-up want rid of a file. To remove a file and simultaneously stage its removal we can use git rm filename. As you might have guessed by now, filename can be either the name of a file in your current working directory or a path to a specific file.
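A minimal sketch of removing a tracked file, again in a scratch repository with a made-up file name and user details:

```shell
# Scratch repository in a temporary directory (all names are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a file that we will later remove
echo "temporary" > scratch.txt
git add scratch.txt
git commit -q -m "added scratch.txt"

# Delete the file and stage the deletion in one step
git rm -q scratch.txt
git status --short    # the deletion is listed as staged

# Commit the removal
git commit -q -m "removed scratch.txt"
```

The file disappears from the working directory and from future commits, but the earlier commit that added it is still in the history.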
In this post we have covered how to unstage files with git reset, revert files to previous states using git checkout, and delete files using git rm.
The next post in this series will cover basic use of branches in Git. See you then.
Thanks for reading. I hope you enjoyed the article and that it helps you to get a job done more quickly or inspires you to further your data science journey. Please do let me know if there’s anything you want me to cover in future posts.
Happy Data Analysis!
Disclaimer: All views expressed on this site are exclusively my own and do not represent the opinions of any entity whatsoever with which I have been, am now or will be affiliated.