Sunday, May 31, 2020

Quick Git command line tutorial

Git is a version control system which facilitates the creation and maintenance of versions in a project. Usually it's used for software code but it can also be used for things like documents. It's most useful for anything that is text based, as you will be able to see which lines have changed across versions.

Given that you've installed git, create a folder that will store your project, open your terminal (cmd in Windows), navigate to the folder, and turn it into a repository by entering:

git init .

This will create a folder called ".git" in your current directory which lets you do repository stuff to it. Now whenever you want to do repository stuff, just open your terminal and navigate back to the folder. To see this we can ask for a status report of the repo by entering:

git status

This will output the following:

On branch master

Initial commit

nothing to commit (create/copy files and use "git add" to track)

This is basically telling us that the repo is empty. Now we start putting stuff there. Let's create a readme file using markdown with the file name readme.md:

My first git repo
=================

This is my readme file.


After saving, if we go back to the terminal and enter:

git status

we will now see:

On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        readme.md

nothing added to commit but untracked files present (use "git add" to track)

This is saying that readme.md is a file in the repo folder that is not being kept track of by git. We can add this file to the git index by entering:

git add readme.md

After asking for the status again, we will now see:

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   readme.md

Now the file is in the index and is 'staged'. If we update the file and save it again:

My first git repo
=================

This is my updated readme file.


the status will now say:

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   readme.md

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   readme.md

This is saying that we have a staged file that has also been modified. We can check what has been changed in the file since we last staged it by entering:

git diff

diff --git a/readme.md b/readme.md
index 75db517..5bdd78c 100644
--- a/readme.md
+++ b/readme.md
@@ -1,4 +1,4 @@
 My first git repo
 =================

-This is my readme file.
+This is my updated readme file.

Diff shows you all the lines that were changed together with some unchanged lines next to them for context. The '-' line was replaced by the '+' line. The "@@ -1,4 +1,4 @@" header tells us where the displayed block of lines sits in the file: "-1,4" means the block starts at line 1 of the old version and covers 4 lines, and "+1,4" means the same for the new version. The line that was changed is the 4th line of that block.
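To make the difference between staged and unstaged changes concrete, here is a minimal sketch that recreates the readme example in a throwaway folder (the temporary folder and the variable names are assumptions made just for this illustration). Plain "git diff" compares your working files with the index, whilst "git diff --staged" shows what is already in the index.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .

# Stage the first version of the readme, then edit it again without staging.
printf 'My first git repo\n=================\n\nThis is my readme file.\n' > readme.md
git add readme.md
printf 'My first git repo\n=================\n\nThis is my updated readme file.\n' > readme.md

# Working files compared with the index: the edit made after staging.
unstaged="$(git diff)"
printf '%s\n' "$unstaged"

# Index compared with the last commit (there is none yet, so the whole file is new).
staged="$(git diff --staged)"
printf '%s\n' "$staged"
```

The first diff contains the '-' and '+' lines for the edited sentence, whilst the second lists readme.md as a new file with its originally staged content.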

Now we stage this modification so that it is also kept track of:

git add readme.md

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   readme.md

Staging changes is not the point of repositories. The point is committing the staged changes. A commit is a backup of the currently indexed files. When you take a backup, you make a copy of your project folder and give it a name so that you can go back to it. This is what a commit does. Enter:

git commit

This will open up a text editor so you can enter a description of what you're committing.
  • If the text editor is vim, which you will know because it is not helpful at all, you need to first press 'i' for 'insert' before you type anything. To save, press ESC, followed by ':', followed by 'w', followed by enter. To exit, press ESC, followed by ':', followed by 'q', followed by enter.
  • If it's nano then just follow the on screen commands, noting that '^' means CTRL.

The commit message you write can later be searched in order to find a particular commit. Note that commits are generally thought of as being unchangeable after they are made, so make sure you write everything you need to. The first line of the commit message has special importance and is considered to be a summary of what has changed. You should keep it under 50 characters long so that it can be easily displayed in a table. It should be direct and concise (no full stops at the end, for example), with the rest of the lines under it being a more detailed description to make up for any missing information in the first line. A blank line needs to separate the first line from the rest of the lines. Note that it's fine to only have the first line if it's enough.

added the word 'updated' to readme

readme.md was updated so that it says that it is my updated readme file.
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
#
# Initial commit
#
# Changes to be committed:
#       new file:  readme.md

The lines with the # in front were written by git and should be left there. If we check the status again we will now see:

On branch master
nothing to commit, working tree clean

This is saying that there were no changes since the last commit. We can now see all of our commits by entering:

git log

commit f71f17b63c6b3ddb7506000cbc422e8f1b173958 (HEAD -> master)
Author: mtanti 
Date:   Sat May 30 10:46:17 2020 +0200

    added the word 'updated' to readme

    readme.md was updated so that it says that it is my updated readme file.

We can see who made the commit, when, and what message was written. It's important that commits are done frequently but on complete changes. A commit is not a save (you do not commit each time you save), it is a backup, and the state of your project in each backup should make sense. Think of it as if you're crossing out items in a todo list and with each item crossed out you're taking a backup. You should not take a backup in between items. On the other hand, your items should be broken down into many simple tasks in order to be able to finish each one quickly.
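If you'd rather not open a text editor, the same kind of commit message can be given on the command line. The following is only a sketch in a throwaway folder, with a placeholder name and email address that are assumptions for the example: the first "-m" becomes the summary line and the second "-m" becomes the detailed description, with git putting the blank line between them.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

printf 'My first git repo\n=================\n\nThis is my updated readme file.\n' > readme.md
git add readme.md

# First -m: summary line. Second -m: body, separated by a blank line.
git commit -q \
    -m "added the word 'updated' to readme" \
    -m "readme.md was updated so that it says that it is my updated readme file."

subject="$(git log -1 --format=%s)"   # summary line of the last commit
body="$(git log -1 --format=%b)"      # everything after the blank line
echo "summary: $subject"
echo "body:    $body"
```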

Now, let's add a folder called 'src' to our repo and then check the status.

On branch master
nothing to commit, working tree clean

In git's eyes, nothing has changed because git does not have a concept of a folder, only of files and their paths, so an empty folder cannot be tracked. We need to put a file in the folder in order to be able to add it to git's index. Let's add 2 text files to src: 'a.txt' and 'b.txt' with the following content each:

line 1
line 2
line 3


The status now shows:

On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

        src/

nothing added to commit but untracked files present (use "git add" to track)

This is saying that we have a new folder called 'src' with some files in it. We can add the folder by using the add command. If you want you can avoid having to include the file names of the files you're adding by just using a '.', which stands for the current folder and so means "all new or modified files in this folder and its subfolders":

git add .

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   src/a.txt
        new file:   src/b.txt
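The way git ignores an empty folder until a file is put in it can be checked with a small sketch like the one below. It runs in a throwaway folder (an assumption for the example) and uses "git status --porcelain", which is a compact version of the status report meant for scripts.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .

mkdir src
empty_status="$(git status --porcelain)"       # nothing is reported for an empty folder

printf 'line 1\nline 2\nline 3\n' > src/a.txt
printf 'line 1\nline 2\nline 3\n' > src/b.txt
untracked_status="$(git status --porcelain)"   # the folder now shows up as untracked

git add .
staged_status="$(git status --porcelain)"      # each file is listed as added

printf 'empty folder: [%s]\n' "$empty_status"
printf 'with files:   [%s]\n' "$untracked_status"
printf 'after add:\n%s\n' "$staged_status"
```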

Let's commit these two files.

git commit

added source files

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
#       new file:   src/a.txt
#       new file:   src/b.txt

Note that I didn't add anything else to the message other than the first line. There is no need to specify what files were added as they can be seen in the commit. Now let's look at the log of commits:

git log

commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e (HEAD -> master)
Author: mtanti 
Date:   Sat May 30 11:17:14 2020 +0200

    added source files

commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
Author: mtanti 
Date:   Sat May 30 10:46:17 2020 +0200

    added the word 'updated' to readme

    readme.md was updated so that it says that it is my updated readme file.

We can see how all the commits are shown in descending order of when they were made. You might be wondering what 'HEAD' is referring to. HEAD is the commit we are working on. We can now move in between commits and move in time. This is very useful if you start working on something and realise that there was a better way to do it and need to undo all your work up to a particular point in the commit history. When we do this, we would be moving the HEAD to a different commit. The HEAD can be moved by using the checkout command:

git checkout HEAD~1

This is saying "go back one commit behind the HEAD". The command gives the following output:

Note: checking out 'HEAD~1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at f71f17b... added the word 'updated' to readme

This is saying that we are now in the commit with the message "added the word 'updated' to readme". It is also possible to use the hash as a commit identifier in order to jump to it directly without being relative to the HEAD. The hash is the 40 digit hexadecimal next to the commits in the log. For example, the first commit had a hash of 'f71f17b63c6b3ddb7506000cbc422e8f1b173958' so we could have entered "git checkout f71f17b63c6b3ddb7506000cbc422e8f1b173958". We can also just use the first 7 digits to avoid typing everything but it's more likely that there will be a collision with another hash.
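If you want to see how relative names, full hashes and shortened hashes relate to each other, "git rev-parse" will translate any of them into a full hash. Here is a rough sketch in a throwaway folder; the file name, commit messages and identity are made up for the example.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two > file.txt
git add file.txt
git commit -q -m "second commit"

full="$(git rev-parse HEAD~1)"            # full hash of the commit before the HEAD
short="$(git rev-parse --short HEAD~1)"   # shortened version of the same hash
resolved="$(git rev-parse "$short")"      # the short hash leads back to the full one

echo "full:     $full"
echo "short:    $short"
echo "resolved: $resolved"
```

In a repo this small the shortened hash is unambiguous, so it can be used anywhere the full hash can.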

If we look at the log now, we'll see:

commit f71f17b63c6b3ddb7506000cbc422e8f1b173958 (HEAD)
Author: mtanti 
Date:   Sat May 30 10:46:17 2020 +0200

    added the word 'updated' to readme

    readme.md was updated so that it says that it is my updated readme file.

which shows that the HEAD has been moved one commit behind in the timeline.

Now if we look at the project folder (and refresh), we'll see that the folder 'src' has been physically removed. We can restore it by moving forward in time and going to the latest commit, which includes the 'src' folder.

Unfortunately, there is no direct notation for moving forward in time as what we just did is not normal usage of git. Note that at the moment the HEAD is said to be 'detached', which means that it is not in a proper place (the end of the timeline). We can get back to the proper place we should be in by checking out to 'master'.

git checkout master

Checking the log, we now see:

commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e (HEAD -> master)
Author: mtanti 
Date:   Sat May 30 11:17:14 2020 +0200

    added source files

commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
Author: mtanti 
Date:   Sat May 30 10:46:17 2020 +0200

    added the word 'updated' to readme

    readme.md was updated so that it says that it is my updated readme file.
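Whether the HEAD is currently detached can also be checked from the command line. This is just a sketch in a throwaway folder with made up commits: "git symbolic-ref -q HEAD" succeeds when the HEAD is attached to a branch and fails quietly when it is detached.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch 'master' as in this tutorial
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two > file.txt
git add file.txt
git commit -q -m "second commit"

# Go back one commit: the HEAD no longer points at a branch.
git checkout -q HEAD~1
if git symbolic-ref -q HEAD > /dev/null; then
    state_before="attached"
else
    state_before="detached"
fi

# Return to the end of the timeline.
git checkout -q master
branch_after="$(git symbolic-ref --short HEAD)"

echo "after 'git checkout HEAD~1': $state_before"
echo "after 'git checkout master': on $branch_after"
```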

So what is this 'master' business? What we were calling timelines are actually called 'branches' in git, and branches are one of the most important things in git. Imagine you've started working on a new feature in your program. Suddenly you are told to let go of everything you're doing, work on fixing a bug, and quickly release an update right away. The feature you're working on is half way done and you can't release an updated version of the program with a half finished function; but there's no way you'll finish the feature quickly enough. Do you undo all the work you did on the feature so that you're at a stable version of the program and able to fix the bug? Of course not. That's what branches are for.

With branches you can keep several versions of your code available and switch from one to the other with checkout. The master branch is the one you start with. Ideally the master's last commit should always be in a publishable state (no 'work in progress'). Of course if you're just working on the master branch then this would not be possible without committing very rarely, which is bad practice. The solution is to have several development branches on which you modify the project bit by bit. Every time you start working on a new publishable version of your project, you start a new branch and work on modifying your project to create the new version. Once you finish what you're doing and are happy with the result, you then merge the development branch into the master branch. If you're in the middle of something and need to fix a bug, you switch to the master branch, create a new branch for fixing the bug, fix it, and merge it into the master. Then you merge the new master code into your earlier development branch so that you can continue working as if nothing happened.

Let's make a development branch. Enter the following:

git branch mydevbranch

This will create a branch called 'mydevbranch' that sticks out from the current branch at the current HEAD, that is, it will create a branch that sticks out from master's last commit. By 'sticks out' I mean that when you switch to mydevbranch, the project will be changed to look like it was at the commit where the branch starts. Alternatively you can include a commit hash after the name of the branch in order to make it stick out from an earlier point in the current branch. For example "git branch mydevbranch f71f17b63c6b3ddb7506000cbc422e8f1b173958" will create a branch from the first commit.

We can see a list of branches by entering:

git branch --list

* master
  mydevbranch

The asterisk shows which branch is currently active (has the HEAD).
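Here is a small sketch of both ways of creating a branch, run in a throwaway folder with made up commits. The branch name 'myoldbranch' is hypothetical and only there to show a branch that starts from an earlier commit.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch 'master' as in this tutorial
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two > file.txt
git add file.txt
git commit -q -m "second commit"

first="$(git rev-parse HEAD~1)"

git branch mydevbranch            # starts at the current HEAD (master's last commit)
git branch myoldbranch "$first"   # hypothetical branch starting at the first commit

git branch --list                 # creating branches does not switch to them
current="$(git symbolic-ref --short HEAD)"
echo "still on: $current"
```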

Now switch to the new branch using checkout:

git checkout mydevbranch

and check the status:

On branch mydevbranch
nothing to commit, working tree clean

It now says that we're on mydevbranch instead of on master. Note that whilst we're on the new branch, any modifications we make to the project will be stored on the branch whilst master will remain as it is. Let's modify src/a.txt to look like this:

line 1
line 2 changed
line 3


And now add and commit this file and then check the log:

commit 9bc4488ac847bceccb746eeafb1a8c239de350f2 (HEAD -> mydevbranch)
Author: mtanti 
Date:   Sun May 31 10:24:23 2020 +0200

    changed line in a.txt

commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e (master)
Author: mtanti 
Date:   Sat May 30 11:17:14 2020 +0200

    added source files

commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
Author: mtanti 
Date:   Sat May 30 10:46:17 2020 +0200

    added the word 'updated' to readme

    readme.md was updated so that it says that it is my updated readme file.

Note how the log shows us where each branch is and where the HEAD is. In this case the HEAD is at mydevbranch's last commit and master is one commit behind it. Let's add another commit after changing src/b.txt:

line 1
line 2
line 3
line 4


Now that we're done with the changes in this new version, we can merge the development branch into master by first switching to master and then merging mydevbranch into it:

git checkout master
git merge mydevbranch

Updating 59cfc3f..f962efb
Fast-forward
 src/a.txt | 2 +-
 src/b.txt | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

Note that this was a fast forward merge, which is the best kind of merge you can have. It happens when master has had no new commits since the branch was created, so all the changes form a neat sequence and git can simply move master forward to the branch's last commit, as opposed to having multiple branches changing stuff at the same time.
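If you want to be sure that a merge is a fast forward, you can ask git to refuse anything else with "--ff-only". The following is a sketch in a throwaway folder with made up commits. After the merge, master and mydevbranch point at the same commit and no extra merge commit was created.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch 'master' as in this tutorial
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

printf 'line 1\nline 2\nline 3\n' > a.txt
git add a.txt
git commit -q -m "added source file"

# Do some work on a development branch whilst master stays untouched.
git checkout -q -b mydevbranch
printf 'line 1\nline 2 changed\nline 3\n' > a.txt
git commit -q -a -m "changed line in a.txt"

# Merge back: --ff-only makes git stop if a fast forward is not possible.
git checkout -q master
git merge -q --ff-only mydevbranch

master_tip="$(git rev-parse master)"
dev_tip="$(git rev-parse mydevbranch)"
merge_commits="$(git rev-list --count --merges master)"
echo "master:        $master_tip"
echo "mydevbranch:   $dev_tip"
echo "merge commits: $merge_commits"
```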

Before seeing what a merge conflict looks like, let's look at the log again now that we've made the merge, but this time as a graph:

git log --graph

* commit f962efb409e4f08f94d717dec866519bc2848e8f (HEAD -> master, mydevbranch)
| Author: mtanti 
| Date:   Sun May 31 10:30:42 2020 +0200
|
|     added a new line to b.txt
|
* commit 9bc4488ac847bceccb746eeafb1a8c239de350f2
| Author: mtanti 
| Date:   Sun May 31 10:24:23 2020 +0200
|
|     changed line in a.txt
|
* commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e
| Author: mtanti 
| Date:   Sat May 30 11:17:14 2020 +0200
|
|     added source files
|
* commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
  Author: mtanti 
  Date:   Sat May 30 10:46:17 2020 +0200

      added the word 'updated' to readme

      readme.md was updated so that it says that it is my updated readme file.

Here we can see a neat straight line moving through a timeline of commits, from the first to the last with no splits along the way. The master and development branch are together on the last commit in the timeline. Now let's see how this can be different.

Switch back to the development branch so that you can start working on the next version of the project:

git checkout mydevbranch

and change src/a.txt to have a new line:

line 1
line 2 changed
line 3
line 4


As you're working on the file, you get a call from your boss who tells you to immediately change all the lines to start with a capital letter in both src/a.txt and src/b.txt. Now what? You need to stop working on mydevbranch, go back to master, and create a new branch for working on the new request.

Before switching branches it's important that you do not have any uncommitted changes in your project, otherwise they will be carried over to the master branch and that would make things confusing. If you're not at a committable point in your change then you can save the current state of the branch files by stashing them instead:

git stash push

This saves your uncommitted changes as an entry on git's stash stack and resets the working directory to the last commit. The entry can later be retrieved and removed from the stack. Next checkout to master:

git checkout master

Create and switch to a new branch.

git branch myurgentbranch
git checkout myurgentbranch

and apply the urgent modifications.

src/a.txt
Line 1
Line 2 changed
Line 3


src/b.txt
Line 1
Line 2
Line 3
Line 4


Commit these changes and merge to master:

git add .
git commit
git checkout master
git merge myurgentbranch

The merge should be a fast forward since we started working on the last commit of master with no interruptions. The log will show us the current situation:

git log --graph

* commit 5dcd492113d3942550a58efdc7b90e15bd36d537 (HEAD -> master, myurgentbranch)
| Author: mtanti 
| Date:   Sun May 31 11:06:39 2020 +0200
|
|     capitalised each line in source files
|
* commit f962efb409e4f08f94d717dec866519bc2848e8f (mydevbranch)
| Author: mtanti 
| Date:   Sun May 31 10:30:42 2020 +0200
|
|     added a new line to b.txt
|
* commit 9bc4488ac847bceccb746eeafb1a8c239de350f2
| Author: mtanti 
| Date:   Sun May 31 10:24:23 2020 +0200
|
|     changed line in a.txt
|
* commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e
| Author: mtanti 
| Date:   Sat May 30 11:17:14 2020 +0200
|
|     added source files
|
* commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
  Author: mtanti 
  Date:   Sat May 30 10:46:17 2020 +0200

      added the word 'updated' to readme

      readme.md was updated so that it says that it is my updated readme file.

You can see how when we commit the changes in mydevbranch, we'll have a fork in the timeline from the straight line that we currently have. Let's see what happens then.

Now that we're done with the urgent request we can go back to the normal development branch:

git checkout mydevbranch

and pop back the changes we stashed:

git stash pop

Note that "git stash list" shows what is in the stash. We were working on src/a.txt where we were adding a new line to the file:

line 1
line 2 changed
line 3
line 4


Now let's imagine that we finished the changes we were making and can commit them:

git add .
git commit

Now a.txt is supposed to have two changes: the new line and each line starting with a capital letter. Each of these changes is on a different branch. Let's start by making the development branch complete by merging the changes we applied to the master into the development branch (note that we're reversing the direction of the merge now because we want the development branch to be up to date).

git merge master

Auto-merging src/a.txt
CONFLICT (content): Merge conflict in src/a.txt
Automatic merge failed; fix conflicts and then commit the result.

This is when things start getting hairy as you'll need to manually fix your conflicting changes. The status will tell us which files need to be fixed:

On branch mydevbranch
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Changes to be committed:

        modified:   src/b.txt

Unmerged paths:
  (use "git add <file>..." to mark resolution)

        both modified:   src/a.txt

This is saying that during merging, src/b.txt was updated with no conflicts but src/a.txt requires manual intervention. If we open src/a.txt we'll see the following:

<<<<<<< HEAD
line 1
line 2 changed
line 3
line 4
=======
Line 1
Line 2 changed
Line 3
>>>>>>> master


The file has been modified by git to show the conflicting changes. Rows of 7 arrows and equals signs are used to highlight sections of changes which need to be resolved: the lines between "<<<<<<< HEAD" and "=======" are the version in the current branch, and the lines between "=======" and ">>>>>>> master" are the version in master. Note that all the lines have been changed here so the section is the whole file. Now we can either fix the file directly or use git's "git mergetool" to help us by showing all the changes. In this case we can modify the file directly:

Line 1
Line 2 changed
Line 3
Line 4


Make sure to remove all the arrows and equals signs. Status will now output:

All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:

        modified:   src/a.txt
        modified:   src/b.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   src/a.txt

It's now saying that conflicts were fixed. All we need to do is add the fixed file and continue the merge.

git add src/a.txt
git merge --continue

At the end of the merge, you will be asked to enter a commit message in order to automatically commit the merge change. Git automatically puts the message "Merge branch 'master' into mydevbranch", which is good enough.
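The whole conflict scenario can be reproduced from scratch with a sketch like the one below. It runs in a throwaway folder with a placeholder identity (both assumptions for the example) and only uses src/a.txt. Setting GIT_EDITOR to "true" is just a trick to accept the default merge message without opening an editor, and "git merge --continue" needs a reasonably recent git.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch 'master' as in this tutorial
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

mkdir src
printf 'line 1\nline 2 changed\nline 3\n' > src/a.txt
git add .
git commit -q -m "added source files"
git branch mydevbranch

# On master: the urgent change (capitalise each line).
printf 'Line 1\nLine 2 changed\nLine 3\n' > src/a.txt
git commit -q -a -m "capitalised each line in source files"

# On mydevbranch: the planned change (add a new line).
git checkout -q mydevbranch
printf 'line 1\nline 2 changed\nline 3\nline 4\n' > src/a.txt
git commit -q -a -m "added new line to a.txt"

# Merging master into mydevbranch stops with a conflict.
if git merge master > /dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
unmerged="$(git diff --name-only --diff-filter=U)"   # files still needing a fix

# Fix the file by hand (here: overwrite it with the combined result).
printf 'Line 1\nLine 2 changed\nLine 3\nLine 4\n' > src/a.txt
git add src/a.txt
GIT_EDITOR=true git merge --continue > /dev/null     # keep the default message

merge_subject="$(git log -1 --format=%s)"
parent_count="$(git log -1 --format=%P | wc -w | tr -d ' ')"
echo "merge result:  $merge_result"
echo "unmerged file: $unmerged"
echo "merge commit:  $merge_subject ($parent_count parents)"
```

The final commit has two parents, one from each branch, which is what shows up as the fork and join in "git log --graph".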

The log now shows the new timeline:

*   commit 49abf32812ec7dbeeab792729098a61fd3446a45 (HEAD -> mydevbranch)
|\  Merge: b6cc3c8 5dcd492
| | Author: mtanti 
| | Date:   Sun May 31 11:36:50 2020 +0200
| |
| |     Merge branch 'master' into mydevbranch
| |
| * commit 5dcd492113d3942550a58efdc7b90e15bd36d537 (myurgentbranch, master)
| | Author: mtanti 
| | Date:   Sun May 31 11:06:39 2020 +0200
| |
| |     capitalised each line in source files
| |
* | commit b6cc3c887533a995e749589b6cdbfaaad530b03e
|/  Author: mtanti 
|   Date:   Sun May 31 11:17:55 2020 +0200
|
|       added new line to a.txt
|
* commit f962efb409e4f08f94d717dec866519bc2848e8f
| Author: mtanti 
| Date:   Sun May 31 10:30:42 2020 +0200
|
|     added a new line to b.txt
|
* commit 9bc4488ac847bceccb746eeafb1a8c239de350f2
| Author: mtanti 
| Date:   Sun May 31 10:24:23 2020 +0200
|
|     changed line in a.txt
|
* commit 59cfc3f057bf1f19038ab15c4357d97bc84ac30e
| Author: mtanti 
| Date:   Sat May 30 11:17:14 2020 +0200
|
|     added source files
|
* commit f71f17b63c6b3ddb7506000cbc422e8f1b173958
  Author: mtanti 
  Date:   Sat May 30 10:46:17 2020 +0200

      added the word 'updated' to readme

      readme.md was updated so that it says that it is my updated readme file.

We can now continue modifying our development branch with anything left to add in the new version and then switch to master and merge, which will be a fast forward since master has no commits that the development branch does not already contain. Note that if you look at the log whilst on master before merging, you will not see the development branch's commits since they are not in the master's timeline. You can see all timelines by entering "git log --graph --all".

git checkout master
git merge mydevbranch

If you want to delete the urgent branch, just enter "git branch --delete myurgentbranch".
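Before deleting a branch you might want to check that it has really been merged. A rough sketch of this, in a throwaway folder with made up commits, uses "git branch --merged master" to list the branches whose commits are all already in master.

```shell
set -eu
repo="$(mktemp -d)"   # throwaway folder, an assumption for this sketch
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch 'master' as in this tutorial
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

printf 'line 1\n' > a.txt
git add a.txt
git commit -q -m "added source file"

# Urgent work on its own branch, merged back into master.
git checkout -q -b myurgentbranch
printf 'Line 1\n' > a.txt
git commit -q -a -m "capitalised each line in source files"
git checkout -q master
git merge -q myurgentbranch

merged="$(git branch --merged master)"   # branches fully contained in master
printf 'merged into master:\n%s\n' "$merged"

git branch --delete myurgentbranch       # only the branch name goes, the commits stay in master
remaining="$(git branch --list myurgentbranch)"
```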

This concludes our quick tutorial. I didn't mention anything about remote repositories and pushing and pulling to and from the repositories but basically if you set up a GitHub account, you can keep a backup online for multiple developers to work together, pushing the local repository to the online one and pulling the online repository when it was changed by someone else. The online repository is referred to as 'origin' in git by default.
