There are many posts about the difference between git's rebase and merge commands. Here I'd like to link two of them that I think did a great job of explaining it.

1. Rebase vs Merge in Git
2. Avoid Merge commit in Git. This one explains the difference between the two options in great detail, and it covers the exact problem I was facing.

I'd also like to quote the work pattern described in the 1st link above. (The result is that all your branch commits end up on top of your base branch, for example master, without the meaningless merge commits that you would get from git merge.)

  1. Create new branch B from existing branch A
  2. Add/commit changes on branch B
  3. Rebase updates from branch A
  4. Merge changes from branch B onto branch A
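
The four steps above can be tried in a scratch repository. This is only a minimal sketch: the branch names master (as branch A) and topic (as branch B), the file names, and the commit messages are made up for illustration, and the example identity is supplied through environment variables so the script does not depend on your git configuration.

```shell
#!/bin/sh
set -e
# Identity for the example commits (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure branch A is called master

echo base > base.txt
git add base.txt
git commit -q -m "base commit on A"

# 1. Create new branch B from existing branch A
git checkout -q -b topic

# 2. Add/commit changes on branch B
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work on B"

# Meanwhile branch A receives someone else's update
git checkout -q master
echo update > update.txt
git add update.txt
git commit -q -m "update on A"

# 3. Rebase updates from branch A
git checkout -q topic
git rebase -q master

# 4. Merge changes from branch B onto branch A (a fast-forward, so no merge commit)
git checkout -q master
git merge -q --ff-only topic

merge_commits=$(git rev-list --merges --count master)
total_commits=$(git rev-list --count master)
last_subject=$(git log -1 --format=%s)
echo "merge commits: $merge_commits, total commits: $total_commits, tip: $last_subject"
```

Because topic was rebased first, the final merge only moves master forward, which is why the history stays linear.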

I typically use the 'master' branch as the base branch, which means that I only sync this branch from the remote repository, and only merge commits into it when I think the bug/feature is completed. Once I've done the merge in master, I push right away, so unpushed commits sit in my local master only briefly. However, I pull from the remote repository into master quite often, as I need to sync other people's work. Then I rebase the topic/bugfix branch that I am working on, which syncs that work without adding a merge commit and places my commits on top of the others' commits that are already in the repo.

One caution regarding the rebase operation (again quoting from the 2nd link above):

If you’ve shared the branch with anyone else, or are pushing it to a clone of the repository, do not rebase, but use merge instead. From the man page:

When you rebase a branch, you are changing its history in a way that will cause problems for anyone who already has a copy of the branch in their repository and tries to pull updates from you. You should understand the implications of using git rebase on a repository that you share.

I know there are some good Git cheat sheets on the internet, so I won't try to create yet another one. Instead, I'll focus on some commands for history and file differences, as I often use them to check what has been updated in the commits, which files were changed, and so on. Usually I go to GitHub to look at previous commits, as it does a very good job of showing the updated files, the difference in each file, etc. However, my home internet is sometimes very slow, so it can be painful to go to GitHub to check those commits. (There is nothing wrong with GitHub; it is just my particular case when the network is bad.)

Git is a DVCS, which means that I have the whole history and all the logs on my local machine; the remote repository serves as a backup and a place to collaborate. So checking commits and differences can be done without connecting to the internet. Here I'll try to enumerate some options/commands that I run for checking those updates/commits. If you know any other good usages/commands that I left out, please feel free to leave a comment.

1. Show all previous commits (the output shows each commit's hash, author, date and message, without the diff):

git log

2. Show the latest n commits:

git log -n 5 (replace 5 with the number of commits you want)

3. Show the statistics (such as the list of updated files) for previous commits:

git log --stat

4. Show previous commits with patches:

git log -p

5. Show a specific commit by its hash id; you only need enough leading characters to make it unique:
Add --stat (if you want to show the statistics) or -p (if you want to show the detailed diff).

git show 36b314c (the first few characters of the id)

6. Show the commit history of a specific file (-p and -n apply here as well, but must come before the filepath):

git log -- filepath

7. Show the diff of a specific file in a specific commit:

git show commitId (like 36b314c for example) -- filepath

8. Show the difference between two commits:

git diff firstCommitId secondCommitId

9. Show the difference between two commits for a specific file:

git diff firstCommitId secondCommitId -- filepath
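
To see a few of these commands working together without any network access, here is a small sketch that builds a throwaway repository with two commits and then inspects it. The file names (notes.txt, other.txt), the commit messages and the example identity are all made up for illustration.

```shell
#!/bin/sh
set -e
# Identity for the example commits (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository with two commits.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo one > notes.txt
git add notes.txt
git commit -q -m "add notes"
first=$(git rev-parse --short HEAD)    # abbreviated id of the first commit

echo two >> notes.txt
echo x > other.txt
git add notes.txt other.txt
git commit -q -m "extend notes, add other"
second=$(git rev-parse --short HEAD)   # abbreviated id of the second commit

# Latest commit only, subject line only
latest=$(git log -n 1 --format=%s)

# History limited to one file
file_history=$(git log --format=%s -- other.txt)

# Files that differ between the two commits
changed=$(git diff --name-only "$first" "$second")

# Difference between the two commits for a single file
notes_diff=$(git diff "$first" "$second" -- notes.txt)

echo "$latest"
echo "$changed"
echo "$notes_diff"
```

The --format=%s and --name-only options are used here only to keep the output short; the plain forms listed above work the same way and print more detail.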

I am starting to work on a task in ODE titled "cleanup JPA impl" (https://issues.apache.org/jira/browse/ODE-704). It is basically a fairly large refactoring of the DAO layer of the ODE project. Because I am not an ODE committer, it is not easy to get this task done.

After I talked with Rafal Rusin on the ode IRC channel, he suggested that I create a git project on GitHub that clones the ode git repo, and put my jpa refactoring experiment branch there. Once my code is finished and passes all the tests, they can merge my branch into the ode trunk.

I had heard of Git for a while but never got a chance to use it, as I am currently still using Subversion as my SCM repository. One thing I have learnt from experience is to avoid branches in SVN or CVS as much as possible: merging a branch back to trunk is a real headache, which makes it very inconvenient to try out a new feature or some experimental code.

Creating a project on GitHub is very easy; if you have problems, GitHub's help is your friend. I have to say that GitHub has done an awesome job on project hosting. It makes it very easy to browse your code, the diffs between versions, etc.

I watched Linus' Git talk at Google and found it very interesting. I totally agree that one pain with a centralized repository like CVS or SVN is that you need commit access to contribute (I am not talking about a small fix or patch; I mean a large task or feature that would require multiple patches and might take one or two weeks), which is the case I am hitting now. If you host your code repository with Git instead, I can clone it from the url, pull changes into my local workspace to keep it up to date, and push my work to some other repository. Once I have finished my task, I send you the location of my repository, and you can look at the code and decide whether to accept it. In this case, I don't need commit permission in advance. Merging in Git is also very easy, which makes collaboration really easy.
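
The flow described above can be tried entirely on a local disk. This is a rough sketch under stated assumptions, not how the ODE project is actually set up: the directories upstream, fork.git and contributor stand in for the project's repository, my own public repository and my workspace, and the branch name jpa-refactoring, the file dao.txt and the commit messages are made up.

```shell
#!/bin/sh
set -e
# Identity for the example commits (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the project's repository, which I have no commit access to.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo "dao" > dao.txt
git add dao.txt
git commit -q -m "initial import"
cd ..

# Stand-in for my own public repository (for example one hosted on GitHub).
git init -q --bare fork.git

# Clone the project, keep the workspace up to date, and work on a branch.
git clone -q upstream contributor
cd contributor
git pull -q --ff-only origin master
git checkout -q -b jpa-refactoring
echo "refactored" >> dao.txt
git commit -q -a -m "refactor DAO layer"

# Push the branch to my own repository; no permission on upstream is needed.
git remote add fork ../fork.git
git push -q fork jpa-refactoring
cd ..

# Later, a committer fetches my branch, reviews it, and merges it.
cd upstream
git fetch -q ../fork.git jpa-refactoring
git merge -q --ff-only FETCH_HEAD
merged_subject=$(git log -1 --format=%s)
echo "upstream tip: $merged_subject"
```

The point of the sketch is that the only repository I ever push to is my own; the committer pulls from it when the work is ready.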

So, I would strongly recommend that every open source project embrace Git as its SCM tool. Let's forget about Subversion and CVS. Let's embrace the branch. ;-)

A couple of resources that could help you get started with Git:
1. Git website
2. GitHub website
3. Learn Git website (strongly recommended for learning)

See this blog entry by Johnny. It is very useful if you don't use a GUI such as Tortoise on Windows.

One more note on this: you can export SVN_EDITOR set to your favorite text editor, and then when you run:
svn ci
it opens the editor you specified so that you can enter your commit comment.

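I still remember when I first joined Achievo: my first task was refactoring the Log methods. The change itself was small, but it touched a lot of files... And we were using VSS at the time, so I had to commit folder by folder. As I recall, you could not commit at the level of the whole project.
I had used CVS before, so when I first switched to VSS I assumed that checking out a file would also fetch its latest version. It did not (though I think this was configurable). So when I committed later, I apparently overwrote a few files, and because of those few files I worked overtime that day, since I had to go through the files one by one...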

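I remember that during my 2 years at Achievo, we generally committed one file at a time, because each file needed its own comment. That was tedious, especially after we moved to SVN a year ago and kept doing the same thing. For a large refactoring it was simply painful. There is another important issue: if you find that your commit broke the build, you can revert that commit. But if you split the change across many commits, reverting becomes a hassle, and once there are enough of them you find it is practically impossible... So you quietly lose a very important capability.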

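When I watched how the Apache CXF project is developed, I saw that code is committed for the project as a whole, and the comment states the purpose or reason for the commit or change; there is no need to write down which methods you modified (SVN will tell you that)...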

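Since coming to my new company, I feel that a few things are a must for software development:
1. CruiseControl (or another integration tool)
2. Subversion
3. JIRA or Bugzilla etc
4. Maillist
5. WIKI
I think Subversion and CruiseControl are very basic and are usually there, and JIRA or Bugzilla usually is too. But what I want to stress is the maillist and the WIKI. These two have worked especially well for me: whenever there is a question, ask it on the maillist...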