Git lesson 4: Looking at file differences with git diff

Moving on from the previous lesson, you will now learn how to view the differences between files in your repository.
More precisely, you will learn to view the differences between one version of the files in your repository and another version of those same files. As you get more familiar with Git, you will find yourself using this command all the time. At first, however, you will primarily use git diff to view changes you have made to a file but have not yet committed. Let me explain with an example.

Showing file differences

If you have been following along with these lessons, you should have a repository called project_repo, which contains a file called doc1.txt. This file should contain a single line of text: 'Line 1, doc 1. Branch master project_repo.'. In lesson 2 you added this file to your repository using git add and you committed that version of the file using git commit. You can review your original commit using the git log command; you have a save point that you can always recover if need be.

You are now going to add some text to doc1.txt as well as add a new file, doc2.txt, to your repository.
I will use the echo command to make these changes, but feel free to use your favourite text editor.

echo 'Line 2, doc 1. Branch master in project_repo.' >> doc1.txt
echo 'Line 1, doc 2. Branch master in project_repo.' > doc2.txt 
# Check the status of your repository
git status

If everything went as planned, you should see the following message on the command line:

16:33 ~/.../project_repo$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   doc1.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    doc2.txt

no changes added to commit (use "git add" and/or "git commit -a")

Git has recognized that doc1.txt has changed, and that doc2.txt is a new file in your repository. Before you commit changes to your repository, it is a good idea to view the changes that will be part of the commit. Use the git diff command to view these changes:

git diff
diff --git a/doc1.txt b/doc1.txt
index 2edbc67..1681703 100644
--- a/doc1.txt
+++ b/doc1.txt
@@ -1 +1,2 @@
 Line 1, doc 1. Branch master project_repo.
+Line 2, doc 1. Branch master in project_repo.

The output of this command may seem a little cryptic, but there is no need to get overwhelmed. This output can be used to view file differences in a program that has a graphical interface; this is a very useful feature that will be introduced in a future lesson. For the time being, just focus on the last few lines of the output. They indicate that you have added (+) a line to doc1.txt.
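If you are curious about the line that starts with @@, the following sketch may help. It rebuilds the doc1.txt change in a throwaway repository (the temporary directory and the placeholder user name and email are assumptions made for this example, not part of project_repo) and prints only that line, which Git calls the hunk header.

```shell
# Work in a scratch repository so project_repo is left untouched.
# Note: this changes your current directory to the scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the scratch repo
git config user.email "user@example.com"

# Recreate the committed version of doc1.txt, then append a second line.
echo 'Line 1, doc 1. Branch master project_repo.' > doc1.txt
git add doc1.txt
git commit -q -m 'Add doc1'
echo 'Line 2, doc 1. Branch master in project_repo.' >> doc1.txt

# Keep only the hunk header from the diff output.
hunk=$(git diff | grep '^@@')
echo "$hunk"
# prints: @@ -1 +1,2 @@
#   -1   : the old version of this section starts at line 1 and is 1 line long
#   +1,2 : the new version of this section starts at line 1 and is 2 lines long
```

When a count is omitted, as in -1, it means a single line.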

But what about the line of text you added to doc2.txt? Well, git diff does not show changes to files that Git is not currently tracking. Go ahead and add doc2.txt to your repository. Then add doc1.txt so that the changes to both these files will be part of the same commit.

# Multiple files can be added at the same time.
git add doc2.txt doc1.txt  
git commit -m 'Add doc2 to repo, and add line to doc1'
git slog

Seeing line-level and word-level differences

Let's make a few more changes to your files so that you can use git diff some more. You are first going to add a line of text to doc2.txt:

echo 'Line 2, doc 2. Branch master in project_repo.' >> doc2.txt

For the second change, open doc1.txt in a text editor and change the first line so that it reads: 'Line 1, doc 1. We are on branch master in project_repo.' Then add the following line to the end of the file (in the text editor or on the command line):

echo 'Line 3, doc 1. Branch master in project_repo.' >> doc1.txt

Now have a look at changes you made to your two files with git diff:

git diff
diff --git a/doc1.txt b/doc1.txt
index 8ce2b21..cabf64e 100644
--- a/doc1.txt
+++ b/doc1.txt
@@ -1,2 +1,3 @@
-Line 1, doc 1. Branch master project_repo.
+Line 1, doc 1. We are on branch master in project_repo.
 Line 2, doc 1. Branch master in project_repo.
+Line 3, doc 1. Branch master in project_repo.
diff --git a/doc2.txt b/doc2.txt
index 645da70..62d2c03 100644
--- a/doc2.txt
+++ b/doc2.txt
@@ -1 +1,2 @@
 Line 1, doc 2. Branch master in project_repo.
+Line 2, doc 2. Branch master in project_repo.

You can now see the changes you made to both doc1.txt and doc2.txt. Lines that have been added are preceded by +, whereas those that have been removed are preceded by -. Note that git diff reports the edit you made to the first line of doc1.txt as the removal of the old line followed by the addition of a new line.

This type of thinking works well when the document you are working on is a computer program. Computer code has syntax rules and is very particular about variable and function names, so highlighting changes to a whole line of code is often helpful.

However, when you are working on a document such as a manuscript or a thesis, the changes you make are often word-level changes, such as adding a missing 'the', fixing a spelling mistake or adding an adjective. With these types of documents, you may prefer to view word-level changes rather than line-level changes. This can easily be done using the git diffc alias we created back in Lesson 0. Go ahead and try it!
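If you do not have the diffc alias from Lesson 0 at hand, Git's built-in --word-diff option gives a similar word-level view. The sketch below is only an illustration: it assumes a throwaway repository with a placeholder user name and email, and it does not claim to reproduce exactly what the diffc alias runs.

```shell
# Work in a scratch repository so project_repo is left untouched.
# Note: this changes your current directory to the scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the scratch repo
git config user.email "user@example.com"

# Commit the original first line, then reword it in the working directory.
echo 'Line 1, doc 1. Branch master project_repo.' > doc1.txt
git add doc1.txt
git commit -q -m 'Add doc1'
echo 'Line 1, doc 1. We are on branch master in project_repo.' > doc1.txt

# --word-diff wraps removed words in [-...-] and added words in {+...+},
# instead of reporting the whole line as removed and re-added.
words=$(git diff --word-diff)
echo "$words"
```

On a colour terminal, git diff --color-words shows the same word-level comparison using colours instead of the bracket markers.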

Viewing differences of staged files

Now that you have reviewed the changes you made to your files, you can add them to the staging area so they will be part of your next commit (see lesson 2 if you need a reminder about the staging area). Go ahead and add doc1.txt and doc2.txt.

Now that the files have been added, look at the status of the repository and the differences in your files. You should get the following output:

11:33 ~/.../project_repo$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   doc1.txt
    modified:   doc2.txt

11:33 ~/.../project_repo$ git diff
11:33 ~/.../project_repo$

You definitely changed the files in your repository, but git diff is telling you there are no changes. How can this be? What you have to be aware of is that, by default, git diff compares your working directory with the staging area, so it only shows changes that you have not yet added with git add. Once a change has been staged, it no longer appears in the plain git diff output. However, you can use the --staged option to see the difference between the files that you have staged and your previous commit:

git diff --staged

Nicely done! Now go ahead and commit these changes to your repository; you can make up your own commit message.
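To recap which versions each form of the command compares, here is a small sketch. It uses a throwaway repository with a placeholder user name and email and made-up file contents (all assumptions for this example) in which one change is staged and another is not.

```shell
# Work in a scratch repository so project_repo is left untouched.
# Note: this changes your current directory to the scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the scratch repo
git config user.email "user@example.com"

echo 'committed line' > doc1.txt
git add doc1.txt
git commit -q -m 'Add doc1'

echo 'staged line' >> doc1.txt
git add doc1.txt                    # this change is now in the staging area
echo 'unstaged line' >> doc1.txt    # this change exists only in the working directory

unstaged=$(git diff)                # working directory vs. staging area
staged=$(git diff --staged)         # staging area vs. last commit
both=$(git diff HEAD)               # working directory vs. last commit

echo "$unstaged"
echo "$staged"
echo "$both"
```

The first diff reports only 'unstaged line' as added, the second only 'staged line', and the third reports both.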

Summary

You now know how to view changes to files in your repository before you commit them. In addition to regularly reviewing the status of your repository, it is good Git practice to review your changes before you commit them. The git diff command and the git diffc alias are very powerful and can do much more than what you have seen here. We will introduce some of these other uses in future lessons.
