Introduction
This document provides a comprehensive overview of Version Control Systems (VCS), focusing on Git and GitHub. It covers the types of VCS, their architectures, core Git concepts, basic and advanced commands, branching, merging, remotes, and related tooling, structured for technical clarity and reference.
Git & GitHub
Version Control System
- Definition:
It is a system that records changes to a file or set of files over time so that you can recall specific versions later.
Types of VCS:
I. Local VCS:
flowchart TD
File -->|batch1| Version1
File -->|batch2| Version2
File -->|batch3| Version3
Version1 --> Version2 --> Version3
Version3 --> VersionDB["Version DB"]
II. Centralized VCS:
A single central server holds the versioned files, so developers on other systems can work together.
flowchart TD
ComputerA["Computer A (File)"] --> CentralVCSServer["Central VCS Server"]
ComputerB["Computer B (File)"] --> CentralVCSServer
CentralVCSServer --> VersionDB["Version DB"]
VersionDB --> Version1
VersionDB --> Version2
VersionDB --> Version3
- Clients check out the latest snapshot of the file.
Problems:
- Steep learning curve
- Server down: nobody can collaborate or save versioned changes until it is back.
- Server corrupt: the entire history is lost unless backups exist.
III. Distributed VCS:
- Clients don't just check out the latest snapshot; they fully mirror the repository, including its full history.
flowchart TD
Server["Server Computer\nVersion DB"] -->|File| ComputerA["Computer A"]
Server["Server Computer\nVersion DB"] -->|File| ComputerB["Computer B"]
ComputerA --> VersionDB_A["Version DB"]
ComputerB --> VersionDB_B["Version DB"]
VersionDB_A --> V1
VersionDB_A --> V2
VersionDB_A --> V3
VersionDB_B --> V3
VersionDB_B --> V2
VersionDB_B --> V1
VersionDB_A <--> VersionDB_B
Git
- Git is a DVCS that stores its data as a series of snapshots of a miniature filesystem (FS), rather than as a list of per-file changes (deltas).
flowchart LR
V1["V1"] --> V2["V2"] --> V3["V3"] --> V4["V4"] --> V5["V5"]
FileA1["File A"] --> A1
FileA2["File A"] --> A2
FileC1["File C"] --> A1
FileC2["File C"] --> A2
FileC3["File C"] --> A3
%% Representation of Delta-based VCS
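- Delta-based VCS (for comparison): other systems store a base version of each file plus the changes made to it over time.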
Snapshots of Data
flowchart LR
V1 --> V2 --> V3 --> V4 --> V5
V1 -->|FileA, FileB1, FileC| V2
V2 -->|FileA1, FileB, FileC1| V3
V3 -->|FileA1, FileB1, FileC2| V4
V4 -->|FileA2, B1, C3| V5
V5["V5 (A3, B2, C3)"]
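- Nearly every operation in Git is local.
- Git addresses data by its SHA-1 hash; everything is checksummed before it is stored and is then referenced by that checksum.
- So, no change or corruption can go unnoticed by Git.
- Almost everything is undoable: Git generally only adds data, so committed work is hard to lose.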
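The checksum idea can be sketched in a few lines of Python. This is a minimal illustration, assuming the default SHA-1 object format; the function name `git_blob_id` is hypothetical, and the sketch models only how a blob's ID is derived, not compression or storage.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

original = git_blob_id(b"test content\n")
tampered = git_blob_id(b"test c0ntent\n")

print(original)
print(original != tampered)  # a one-character change yields a different ID
```

Because the ID is derived from the content itself, a file that changes on disk or in transit no longer matches the ID it is referenced by, which is how corruption gets noticed.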
Three States
- Committed: Data is safely stored in the local database.
- Modified: The file has changed but has not been committed yet.
- Staged: A modified file has been marked to go into the next commit snapshot.
Workflow Diagram
flowchart LR
WD["Working Directory"] -->|stage files| SA["Staging Area"]
SA -->|commit| GR["git repo"]
GR -->|checkout the project| WD
Git Objects and Areas
- The Git directory (.git) is where Git stores the metadata and object database for the project.
- The working tree is a single checkout of one version of the project: its files are pulled out of the compressed database and placed on disk.
- The staging area (the index) is a file in the Git directory that stores information about what will go into the next commit.
File States Table
| State | Tracked? | Action | Next state |
|---|---|---|---|
| Untracked | No | Add file | Staged |
| Unmodified | Yes | Edit file | Modified |
| Modified | Yes | Stage file | Staged |
| Staged | Yes | Commit | Unmodified |
| Unmodified | Yes | Remove file | Untracked |
flowchart LR
Untracked -->|add file| Staged
Unmodified -->|edit file| Modified
Modified -->|stage file| Staged
Staged -->|commit| Unmodified
Unmodified -->|remove file| Untracked
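The file lifecycle can be read as a small state machine. The Python sketch below is only an illustration: the names `TRANSITIONS`, `next_state`, and `run` are hypothetical, and it assumes that adding an untracked file puts it straight into the staged state, which is how `git add` behaves.

```python
# Allowed transitions in the file lifecycle: (state, action) -> next state.
TRANSITIONS = {
    ("untracked", "add"): "staged",
    ("unmodified", "edit"): "modified",
    ("modified", "stage"): "staged",
    ("staged", "commit"): "unmodified",
    ("unmodified", "remove"): "untracked",
}

def next_state(state: str, action: str) -> str:
    """Return the state a file moves to, or raise if the action does not apply."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} a file that is {state}") from None

def run(actions, state="untracked"):
    """Apply a sequence of actions and return every state visited, in order."""
    visited = [state]
    for action in actions:
        state = next_state(state, action)
        visited.append(state)
    return visited

# A new file is added, committed, edited, staged again, and committed again.
print(run(["add", "commit", "edit", "stage", "commit"]))
```

Anything not listed in the table, such as committing an untracked file, is rejected rather than silently ignored.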
Common Git Commands
git init
git add <file>
git status
git diff # (Working Directory ↔ Staged Area)
git diff --staged # (Staged Area ↔ Last Commit)
git commit -m "message for commit"
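git commit --amend -m "message2" # replaces the last commit, using whatever is currently staged
git rm --cached [file] # untracks the file but keeps it on disk; add -r for a directory, -f (forced) if it has staged changes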
More Commands
git rm [file]
# removes file from staged area as well as working directory
git log
# Shows the commit history; it has many formatting and filtering options that are best looked up when needed.
Undoing Things
I. Amend Last Commit
git commit -m "last commit"
git add forgotten_file
git commit --amend
II. Unstaging a file
git reset HEAD <file_name>
III. Unmodifying a modified file
git checkout -- <file_name>
Remotes
- Versions of the project hosted elsewhere.
git remote -v
git remote add <remote-name> <remote-url>
git pull <remote-name>
git push <remote> <branch>
git remote rename <old> <new>
git remote remove <remote-name>
- Tagging: Not a major thing to study.
Aliases
git config --system # system-wide, all users
git config --global # current user
git config --local # current working repo (the default when writing)
git config alias.ci 'commit' # "git ci" now runs "git commit" in this repo
Killer Branching
What a Commit Looks Like
classDiagram
class Commit {
id: 98ca9
tree: 92ec2
author
message
}
class Tree {
id: 92ec2
blob 5b1d3: Readme
blob c0a0a: License
}
class Readme {
id: 5b1d3
}
class License {
id: c0a0a
}
Commit --> Tree
Tree --> Readme
Tree --> License
A Commit Tree
flowchart LR
C1["Commit 1 (98ca9)"] --> C2["Commit 2 (34ac2)"]
C2 --> C3["Commit 3 (6302e)"]
C1 --> S1["Snap 1"]
C2 --> S2["Snap 2"]
C3 --> S3["Snap 3"]
- Commits and their parents.
- A branch is a lightweight, movable pointer to a commit: a file holding the 40-character SHA-1 of that commit.
git branch newBranch
# Creates a new pointer to the same commit you're currently on.
- HEAD is a pointer to the branch we are currently on.
git checkout newBranch
git commit
# HEAD points to newBranch, and newBranch moves forward to the new commit
flowchart LR
C0 <--> C1 <--> C2 <--> C3
C3 -->|new| HEAD
git checkout -b <newBranchName>
# Creates one & checks out
git branch -d <branchName>
# Deletes branch
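The "branch is just a pointer" idea can be modelled in a few lines. This is a toy Python sketch, not Git's implementation: the `Repo` class, its method names, and the `C0`, `C1` commit IDs are all hypothetical, and the model skips the "is it merged?" check that `git branch -d` performs.

```python
class Repo:
    """Toy model: commits know their parent; branches are names pointing at commits."""

    def __init__(self):
        self.parents = {"C0": None}        # commit id -> parent id
        self.branches = {"master": "C0"}   # branch name -> commit id
        self.head = "master"               # HEAD names the current branch
        self._counter = 0

    def branch(self, name):
        # Like `git branch <name>`: a new pointer to the current commit.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        # Like `git checkout <name>`: only HEAD changes.
        if name not in self.branches:
            raise KeyError(name)
        self.head = name

    def commit(self):
        # The new commit's parent is the current tip; only the current branch moves.
        self._counter += 1
        new_id = f"C{self._counter}"
        self.parents[new_id] = self.branches[self.head]
        self.branches[self.head] = new_id
        return new_id

    def delete_branch(self, name):
        # Removes the pointer only; the commits themselves stay in the store.
        if name == self.head:
            raise ValueError("cannot delete the checked-out branch")
        del self.branches[name]

repo = Repo()
repo.branch("newBranch")
repo.checkout("newBranch")
repo.commit()
print(repo.branches)  # master is still on C0; newBranch has moved to C1
```

Creating or deleting a branch never copies or removes commits, which is why branching is so cheap.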
flowchart LR
C0 <--> C1 <--> C2 <--> C3
C2 --> C4
C3 --> C5
C5 -->|special commit (merge)| MergeCommit
Head --> master
Head --> feature
git checkout <branch> # branch to merge into
git merge <branch> # branch to merge
Three-Way Merge
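- Done via LCA (lowest common ancestor): Git compares the contents of the two branch tips against their common ancestor.
- When the same part of a file is changed differently in both branches, merge conflicts arise.
- Resolve them.
- Add them (git add marks each file as resolved).
- git commit to conclude the merge.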
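The decision rule behind a three-way merge can be sketched as follows. This is a simplified Python illustration that works on whole files (Git itself merges at the level of changed regions inside a file); the function name `three_way_merge` and the sample file names are hypothetical.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two versions of {path: content} against their common ancestor."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:            # both sides agree (including "both deleted")
            result = o
        elif o == b:          # only theirs changed
            result = t
        elif t == b:          # only ours changed
            result = o
        else:                 # both changed differently: needs a human
            conflicts.append(path)
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts

base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "4"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # a.txt comes from ours, b.txt from theirs
print(conflicts)  # c.txt was changed on both sides
```

The common ancestor is what makes the merge automatic: without it there is no way to tell which side actually changed a file.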
Branch Management
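- git branch -v
Shows all branches, with the last commit on each.
- git branch --merged / git branch --no-merged
Shows branches that are merged / not yet merged into the current branch.
- git branch -d <branch-name>
Deletes the branch (refused if it contains unmerged work).
- git branch -D <branch-name>
Deletes the branch even if it has not been merged.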
Remote Branches
- git remote -v
Lists all remotes with their URLs.
- git remote show <remoteName>
Shows complete information about a remote, including its branches.
- Note: Remote-tracking references are local references that you can't move yourself; Git moves them for you whenever you do network communication.
flowchart LR
C0 --> C1 --> C2
github["github.com/someproject"]
git clone <url>
flowchart LR
C0 --> C1 --> C2
branchRemote["(Remote)"]
branchLocal["(Local)"]
branchRemote --> MyComputer
- Now I can continue to work on my local master; origin/master (my local view of the remote's master) won't update unless I do network communication.
Example:
git fetch origin
flowchart LR
C0 --> C1 --> C2 --> C3
master["master"]
After fetch:
flowchart LR
C0 --> C1 --> C2 --> C3
C2 --> C4 --> C5
master["master"]
myComputer["My Computer"]
git pull
# = git fetch + git merge origin/master
git pull --rebase
# = git fetch + git rebase origin/master
git push
# Pushes commits to the server and moves the remote's branch pointer to the current commit.
For any project, to avoid merge commits:
git pull --rebase
git push
Notes:
Git knows the connection between master & origin/master.
This is called the "remote tracking" property of branches. The "master" branch is set to track "origin/master" automatically by git clone.
To make any other branch track a remote branch:
git checkout -b <localBranch> origin/master
# Creates a new local branch that tracks origin/master.
git branch -u origin/foo foo
# For an existing branch, sets foo to track origin/foo.
git branch -u origin/foo
# Same, if foo is currently checked out.
Git Push Requirements
- git push origin master
Go to the origin remote and push my master branch to the remote's master.
- git push origin <source>:<destination>
Go to the origin remote and push my source branch to the remote's destination branch.
- If destination doesn't exist, git will create a new branch on the remote.
Git Fetch Requirements
- git fetch origin foo
Updates the origin/foo branch.
- git fetch origin <source>:<destination>
Updates the local destination branch with the remote's source commits.
- Note: Not a good practice.
If source is empty:
- Weird things happen.
- For push (git push origin :<destination>): Deletes the remote branch.
- For fetch (git fetch origin :<destination>): Creates a new branch locally.
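The rules above for `<source>:<destination>` can be summarised in a small Python function. This is only a restatement of these notes in code, not a model of everything a Git refspec can express; the name `describe_refspec` and its output wording are hypothetical.

```python
def describe_refspec(command: str, refspec: str, remote: str = "origin") -> str:
    """Describe what `git <command> <remote> <refspec>` does, per the rules above."""
    if command not in ("push", "fetch"):
        raise ValueError(f"unsupported command: {command}")
    source, colon, destination = refspec.partition(":")
    if not colon:
        destination = source  # no colon: the same name is used on both sides
    if command == "push":
        if not source:
            return f"delete branch {destination} on {remote}"
        return f"update {remote}'s {destination} from local {source}"
    # command == "fetch"
    if not source:
        return f"create local branch {destination}"
    if not colon:
        return f"update {remote}/{source}"
    return f"update local {destination} from {remote}'s {source}"

for cmd, spec in [("push", "master"), ("push", "foo:bar"), ("push", ":foo"),
                  ("fetch", "foo"), ("fetch", "foo:bar"), ("fetch", ":bar")]:
    print(f"git {cmd} origin {spec:8} -> {describe_refspec(cmd, spec)}")
```

Seen this way, the "weird" empty-source cases are simply pushing or fetching "nothing" into the destination.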
Rebasing
(An alternative to merging; the example branch is experiment)
flowchart LR
C0 --> C1 --> C2
C2 --> C3
C2 --> C4
C3 --> master
C4 --> experiment
Basic Merge
Other way is rebase:
git checkout <branchname> # the branch whose commits will end up on top
git rebase <branch2> # the branch that becomes the new base (bottom)
flowchart LR
C0 --> C1 --> C2 --> C3 --> C4r["C4'"]
C3 --> master
C4r --> experiment
git checkout master
git merge experiment
flowchart LR
C0 --> C1 --> C2 --> C3 --> C4
master --> experiment
git rebase <bottom> <top> # shorthand: checks out <top>, then rebases it onto <bottom>
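What rebase does to the history can be sketched with a parent map. This is a toy Python model under stated assumptions: history is linear on each branch (no merge commits), replayed commits are named by appending `'` to the old ID, and the helper names `ancestors` and `rebase` are hypothetical.

```python
def ancestors(parents, tip):
    """Return the chain from tip back to the root, tip first (linear history only)."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parents[tip]
    return chain

def rebase(parents, top, bottom):
    """Replay the commits that only `top` has onto `bottom`; return the new tip."""
    on_bottom = set(ancestors(parents, bottom))
    # Commits reachable from top but not from bottom, oldest first.
    to_replay = [c for c in ancestors(parents, top) if c not in on_bottom][::-1]
    new_base = bottom
    for commit in to_replay:
        copy = commit + "'"          # a replayed commit is a new commit
        parents[copy] = new_base
        new_base = copy
    return new_base

parents = {"C0": None, "C1": "C0", "C2": "C1", "C3": "C2", "C4": "C2"}
branches = {"master": "C3", "experiment": "C4"}

# Like `git rebase master experiment`: experiment is replayed on top of master.
branches["experiment"] = rebase(parents, branches["experiment"], branches["master"])
# Like `git checkout master` + `git merge experiment`: a fast-forward.
branches["master"] = branches["experiment"]

print(ancestors(parents, branches["master"]))
```

The original C4 is not modified; a copy is created on top of C3, which is why rebased commits get new IDs and the resulting history is a straight line.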
Git on Server
Step 1: Choose protocol
Step 2: Set up
Step 3: Go online (services, hosting git server)
Protocols
I. Local protocol:
- The remote repository is in another directory on the same host.
- Used when some group has access to shared filesystem.
git clone path/to/repo
II. HTTP protocol:
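- Smart HTTP: speaks the Git protocol over HTTP(S); supports authentication and pushing.
- Dumb HTTP: serves the repository as plain files from a web server; read-only.
- Helps in communicating without setting up SSH keys (username/password authentication works).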
III. SSH protocol:
- Most used for self-hosted servers.
git clone ssh://[user@]server/project.git
git clone [user@]server:project.git
- Efficient and secure, but does not support anonymous access.
IV. Git protocol:
- A repository is served only if it contains a git-daemon-export-ok file.
- No authentication, but the fastest network protocol.
Creating Git on Server
Step 1: Create a bare repo
git clone --bare project project.git # the .git suffix is the convention for bare repos
Step 2: Set up a server with SSH access (use openssh-server) and copy the bare repo to /srv/git
scp -r project.git user@git.example.com:/srv/git/
Step 3: On client, run
git clone user@git.example.com:/srv/git/project.git
Git Tools
I. Head
- HEAD is the current position: the commit that is checked out right now.
git checkout C0
# HEAD now points directly at commit C0 (detached HEAD)
git checkout master
# HEAD points at master again
II. Relative Head
git checkout <commit-hash>
# Only use when it is needed (like bisect)
- Move upwards by one commit (in history) with ^, or by several commits with ~<num>.
- Also, HEAD can be used as a relative revision (for example HEAD^ or HEAD~2).
III. Moving Branches Forcefully
git branch -f master C6
git branch -f bugfix HEAD^
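How `^` and `~<num>` walk the history can be sketched in Python. This is a simplified illustration that follows first parents only and does not cover the full revision syntax Git accepts; the function name `resolve` and the sample history are hypothetical.

```python
import re

def resolve(rev, parents, branches, head):
    """Resolve revisions such as 'HEAD^', 'master~2' or 'C3^^' (first parents only)."""
    match = re.fullmatch(r"([^~^]+)(.*)", rev)
    if match is None:
        raise ValueError(f"unsupported revision syntax: {rev!r}")
    name, suffix = match.groups()
    steps = re.findall(r"\^|~\d*", suffix)
    if "".join(steps) != suffix:
        raise ValueError(f"unsupported revision syntax: {rev!r}")
    if name == "HEAD":
        name = head                      # HEAD names a branch (or a commit if detached)
    commit = branches.get(name, name)    # branch name, or a raw commit id
    for step in steps:
        count = 1 if step in ("^", "~") else int(step[1:])
        for _ in range(count):
            commit = parents[commit]
            if commit is None:
                raise ValueError(f"{rev!r} goes past the root commit")
    return commit

parents = {"C0": None, "C1": "C0", "C2": "C1", "C3": "C2"}
branches = {"master": "C3", "bugfix": "C1"}
head = "master"

print(resolve("HEAD^", parents, branches, head))     # one commit above master
print(resolve("master~3", parents, branches, head))  # three commits above master

# Like `git branch -f bugfix HEAD^`: force the bugfix pointer to that commit.
branches["bugfix"] = resolve("HEAD^", parents, branches, head)
```

Moving a branch forcefully is then nothing more than overwriting one entry in the branch table.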
The remaining notes cover more advanced Git concepts: cherry-picking, interactive rebase, and Git internals such as object types, branches, references, and the fundamental structure of Git repositories.
IV. Cherry-Pick (Seriously?)
git cherry-pick C1 C8
- Puts copies of C1 & C8 on top of master (the currently checked-out branch).
V. Rebase Interactive
git rebase -i HEAD~10
# Opens an editor to reorder, squash, edit, or drop the last 10 commits
GIT Internals
- Git is just a key-value data store.
Object Types
- Commit: author, message, pointer to a tree (the snapshot), pointer(s) to parent commit(s)
- Tree: maps file names to blobs (content) and to other trees (subdirectories)
- Blob: Data (source code, pictures, videos, etc.)
Tags & Branches
- Pointers to commits (lightweight)
- Not full copies
- Allows a name for a commit
- Additional meta information (annotated tags)
flowchart TD
HEAD --> branch
branch --> commit
commit --> tree
tree --> blob
- So, git is a database of references.
- It is one giant directed acyclic graph.
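The "key-value store plus references" picture can be sketched in Python. This is a toy model only: it hashes a JSON encoding rather than Git's real object format, and the names `objects`, `refs`, and `put` are hypothetical.

```python
import hashlib
import json

objects = {}   # key-value store: object id -> (type, payload)
refs = {}      # reference name -> commit id

def put(kind, payload):
    """Store an object under the hash of its content and return that key."""
    # Assumption: JSON is used here for simplicity; Git uses its own binary format.
    raw = json.dumps([kind, payload], sort_keys=True).encode("utf-8")
    key = hashlib.sha1(raw).hexdigest()
    objects[key] = (kind, payload)
    return key

readme = put("blob", "hello\n")
tree = put("tree", {"README": readme})
first = put("commit", {"tree": tree, "parents": [], "message": "init"})

refs["refs/heads/master"] = first
head = "refs/heads/master"

# Follow the chain HEAD -> branch -> commit -> tree -> blob.
kind, commit = objects[refs[head]]
_, entries = objects[commit["tree"]]
print(objects[entries["README"]][1], end="")
```

Storing the same content twice yields the same key, so identical files are kept only once, and since objects only ever point at objects that already exist, the graph cannot contain cycles.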
Git Is (contents of the .git directory)
- HEAD (a reference)
- config
- description
- hooks
- index
- info
- objects
- refs: heads, remotes, tags, stash