Introduction to GIT
Comprehensive introduction to the most popular version control system
Version control systems #
Let us take a brief look at what version control systems are and why they are a necessary part of the software development workflow.
Imagine a software code base with thousands of files, and multiple developers contributing code to it. It can be really difficult to manage such a large and complex repository of code without proper tools.
A version control system helps track and manage changes to these files, and enables versioning of source code. There are a number of version control systems, such as Git, Mercurial, Apache Subversion (SVN), and ClearCase. In this article, we will take a look at one of the most popular choices, Git, and how it is used by developers.
Why are version control systems required #
In order to understand what value Git or any other version control system provides, let’s first look at what would happen if we did not use one. We could simply use a cloud storage provider like Google Drive or Dropbox to store and share all of the code files.
With no system in place to track changes to files, it would be very hard to maintain a history of who changed which part of which file, making it difficult for developers to collaborate on a single project. It would also be tedious to manage different versions of the code base. We could create separate copies of each version of the entire code and store them in folders, but we would not be able to easily find the differences between versions or figure out which features went into each one.
There would be no way to recover specific revisions of each file. For example, if some changes went in 10 days ago, and we realize that there is a bug in that code, we would want to track down the exact changes that were made so as to either revert the change until we can figure out a solution or to fix the issue right away. And we would also want to know who made the change since that person would have a better understanding of how the code works, and the same person could help us fix the issue.
Another consideration is that if multiple people are working on the same few files at the same time, there are bound to be problems (merge conflicts) when trying to combine their changes into one working code base. All these issues would lead to chaos and make it nearly impossible for developers to collaborate with each other. Version control systems solve almost all of these problems, and help developers manage their code base while still maintaining their sanity.
To summarize, version control systems are useful for things like
- Recovering specific revisions of each file
- Maintaining a history of who made what change to each file and when
- Managing versions of a product and gracefully managing product features
- Providing a reliable backup of the codebase
- Reverting back in case of issues or bugs
- Collaboration between multiple developers and teams
What is GIT #
Git was created in 2005 by Linus Torvalds for development of the Linux kernel. It is a free and open source version control application used to manage and track changes to a set of files. It is usually accessed through the terminal, although there are graphical applications, for example Sublime Merge or GitHub Desktop, that expose some of the functionality through a simple UI.
What is a terminal: An application on your computer that allows text-based access to the operating system (the command line on Windows). You can install and interact with applications through the terminal.
Installing and Initializing GIT #
Installing #
Installing Git is straightforward: just follow the steps on the downloads page https://git-scm.com/downloads. Although most Macs and Linux installations come with Git pre-installed, you might have to install it on a Windows machine. On Windows, you can use either the Windows command line or Git Bash to access Git, depending on your configuration. You can create a Git repository out of any folder. In fact, Git is also used for version controlling documentation and other metadata. This guide uses macOS for the demo, but the Git commands work the same on Windows and Linux.
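As a quick check, assuming Git is installed and available on your PATH, the following command prints the installed version. The exact version string will differ from machine to machine.

```shell
# Print the installed Git version. An error here usually means Git is
# not installed or is not on the PATH.
git --version
```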
Initializing #
The command git init is used to initialize an empty git repository in a particular folder.
What happens when this command runs? Git creates a hidden folder called .git in the directory where you ran the command. This folder is where Git stores all the metadata regarding your files and the changes made to those files, along with other configuration data.
As you can see in the image above, running git init initialized an empty Git repository and created a hidden folder named .git. We don’t need to do anything with this hidden folder; Git uses it internally to manage data for your repository (you can view hidden files and folders using the ls -a command on Unix-like systems such as macOS and Linux).
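The initialization step can be sketched as follows. This is a minimal example that assumes a Unix-like shell; the scratch folder created with mktemp is only a stand-in for your own project folder.

```shell
# Create a scratch folder so the example does not touch a real project.
repo="$(mktemp -d)"
cd "$repo"

# Initialize an empty repository (--quiet suppresses the confirmation message).
git init --quiet

# List everything, including hidden entries. The .git folder should be listed.
ls -a
```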
So if you want to remove the Git repository from this folder, you can simply delete the .git folder. This will not affect the current state of the other files in your folder, but it will remove the history of changes that Git was tracking, along with the Git-related metadata.
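Here is a small sketch of that removal, again in a throwaway folder (the index.html file and its content are placeholders). Be careful with rm -rf in a real project, since the history cannot be recovered once .git is gone.

```shell
# Set up a scratch repository containing one file.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "Hello World!" > index.html

# Deleting the hidden folder removes the repository and its history.
rm -rf .git

# The working files themselves are untouched.
ls -a
```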
Using GIT #
Checking status #
Once Git is initialized in a folder, the git status command can be used to check the status of the repository.
Check the status of your repository using the command: git status
Running git status in the empty Git repository tells us four things:
- On master branch: We will take a look at branches later, but for now keep in mind that Git creates a default branch called master (or main, depending on configuration) when you initialize a repository.
- There are no commits yet: This means that no changes have been committed to the repository.
- Nothing to commit: There are no changes to any files in this folder that can be committed.
- Create/copy files and use git add to track: Files can be added to this folder, and the command git add can be used to track those files.
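To see these four points for yourself, the following sketch runs git status in a freshly initialized scratch repository. The second command is an extra not used elsewhere in this article; it prints only the name of the current branch.

```shell
# Set up an empty scratch repository.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Full, human-readable status report.
git status

# Print just the current branch name (master or main, depending on configuration).
git symbolic-ref --short HEAD
```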
Tracking #
When a new repository was initialized, Git told us that we can add files to the folder and use the command git add to track those files. Let’s take a look at how this command works.
Let’s say you are developing a simple website and have created a new index file called index.html with this basic HTML template.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Learning Git</title>
</head>
<body>
<div>Hello World!</div>
</body>
</html>
After saving this file to the git-example folder, we can check the status.
Git is now saying there is a new untracked file called index.html. It still says nothing is added to commit, but we can add untracked files using git add. When a new file is created and saved, Git does not proactively do anything about it unless we specifically tell it to. Let us add this file to Git and explore what that means. The documentation says: “This command updates the index using the current content found in the working tree, to prepare the content staged for the next commit.”
So the command git add tells Git to track the file if it is not already being tracked, and also updates the index (Git’s index, not to be confused with the index.html file) with the changes in that file. In other words, it stages the file for commit.
Adding content from an individual file -
git add <filename>
Adding all unstaged content in the current folder -
git add .
Let us run the command and check out what happens.
As expected, Git says that there is a new file called index.html, and it is showing up under Changes to be committed. We will take a look at staging and committing in the next sections.
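The transition from untracked to staged can be sketched like this. The example uses git status --short, a compact form of the status output, and a one-line stand-in for the index.html template.

```shell
# Set up a scratch repository with one new file.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "<div>Hello World!</div>" > index.html

# Before staging: "??" marks an untracked file.
git status --short

# Stage the file (git add . would stage everything in the folder instead).
git add index.html

# After staging: "A" marks a new file in the index, ready to be committed.
git status --short
```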
Staging #
Git has a staging area, an intermediate place where we can edit and review the changes before committing them. When we run the command git add, the changes move into the staging area.
When we commit, all changes that are staged are committed to the repository. Changes can be staged all at once or selectively.
For the example website, let us add another file and stage it: a new file called about.html, saved to the website folder. Let us check the git status before staging this new file.
Git says that there is a new file called about.html which is untracked.
We can use git add to track this file and move it to the staging area. But before we do that, it’s a good idea to take a look at where each file is in the Git process. The index.html file is in the staging area (since we added it earlier using git add), and the new file about.html is in the working directory, since we have not told Git to do anything with it.
Let us add this file to the staging area using git add .
Now git says there are 2 new files that are ready to be committed.
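The two states described above, one file staged and one still only in the working directory, can be reproduced with the sketch below. The file contents are placeholders, and git diff --cached --name-only is one way (not the only one) to list what is currently staged.

```shell
# Set up a scratch repository with index.html already staged.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "<div>Hello World!</div>" > index.html
git add index.html

# Create a second file without staging it.
echo "<div>About</div>" > about.html

# index.html is staged (A); about.html is only in the working directory (??).
git status --short

# Stage everything that is left in the current folder.
git add .

# List the files that are now in the staging area.
git diff --cached --name-only
```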
Great! Let us take a look at how to commit these files in the next section.
Committing #
Staged files are recorded in the repository using the command git commit. The -m option attaches a message to the commit.
Let us commit the 2 files using the command git commit -m "first commit"
After committing the changes, git says there is nothing to commit and working tree is clean — meaning that there are no other changes in the directory that can be added and committed at this time.
Commit staged changes using
git commit -m "<message>"
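A minimal end-to-end sketch of the first commit follows. Git needs an author identity before it can commit; the name and email below are placeholders set for the scratch repository only, and you would use your own details in a real project.

```shell
# Set up a scratch repository with two staged files.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "<div>Hello World!</div>" > index.html
echo "<div>About</div>" > about.html
git add .

# Placeholder identity, stored in this repository's own configuration.
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Commit everything in the staging area with a message.
git commit --quiet -m "first commit"

# The working tree should now be reported as clean.
git status
```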
Awesome! We made the first commit. But how do we take a look at the commit logs? What commits went in before and who made them?
Checking the logs #
The history of previous commits can be accessed using the command git log. This command shows the commits, when they were made, and who made them.
Let us add another file, commit it, and check the logs.
View the logs with command
git log
The logs show all the commits, along with who made them and when. This is especially useful when working in teams where multiple developers contribute code to the same repository.
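The sketch below builds a two-commit history and then prints it. The file name contact.html, the commit messages, and the author identity are illustrative placeholders; git log --oneline is a compact variant that shows one commit per line.

```shell
# Set up a scratch repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# First commit.
echo "<div>Hello World!</div>" > index.html
git add .
git commit --quiet -m "first commit"

# Second commit with another (hypothetical) file.
echo "<div>Contact</div>" > contact.html
git add .
git commit --quiet -m "add contact page"

# Full log: commit hash, author, date, and message for each commit.
git log

# Compact log: abbreviated hash and message, newest commit first.
git log --oneline
```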
Awesome! We have successfully created a git repository, added files to it, committed those changes to the repository and checked the logs.
GitHub #
Now that we have some code committed to a Git repository, we can host it online and collaborate with other developers. This is where something like GitHub comes into the picture.
GitHub is a platform for hosting and reviewing code and for project management. It provides an online space for hosting Git repositories. Other companies, such as Bitbucket and GitLab, provide similar functionality. Future articles will cover this part of the Git workflow.
Summary #
This article explored version control systems and why they are crucial to the software development workflow, what Git is, and how it is used. We looked at basic commands like git init, git status, git add, git commit and git log.
Hopefully this provides a useful introduction to git and the development workflow, and makes you comfortable using git to manage your code. Stay tuned for upcoming articles that will explore advanced git techniques.