[CP2K:2895] How to use git as a client for the CVS server

Ondrej Marsalek ondrej.... at gmail.com
Sun Oct 31 14:10:34 CET 2010


Hi Toon,

thanks for this, I think it is useful to have access to CP2K through
something more flexible than CVS. Some time ago I tried, with the same
motivation, to set up and keep up to date a Bazaar repository of CP2K
at Launchpad. It was basically the same thing, just with Bazaar and
Launchpad in place of git and GitHub. See the conversation that
followed here:

http://groups.google.com/group/cp2k/browse_thread/thread/ba88294ae132314e/a197ced409bf839c

Judging by the relatively long silence, your take on this matter does
a better job and is not taken as "disrespectful to the work of others"
or a "split project". I am glad that this is the case and am looking
forward to using your git mirror of CP2K at github.com.

Best regards,
Ondrej


On Thu, Oct 28, 2010 at 18:10, Toon Verstraelen
<Toon.Ver... at ugent.be> wrote:
>
> Hi all,
>
> I've explained the subject to a few colleagues so far, and it may be of
> interest for others. This (long) mail is also an attempt to avoid explaining
> the same thing over and over again. Have fun.
>
> cheers,
>
> Toon
>
>
>
>
> 1. Summary
> ==========
>
>
> Disclaimer
> ----------
>
> This is not an attempt to convince the entire CP2K community to ditch CVS
> and use git instead. This mail just explains how to avoid using CVS, and is
> written for those who share the belief that CVS is a crappy program to keep
> track of a source code history. I've included a few benchmarks as a light
> form of advocacy, but in the end it is all up to you.
>
>
> Problem
> -------
>
> CVS is slow for large projects like CP2K and offers only a few, impractical
> features for working with branches and experimental versions.
>
>
> Solution
> --------
>
> Maintain a local git mirror of the CVS history. This is also a convenient
> place to keep track of your own patches before they go into the official
> CVS. Once you have a set of patches for a stable implementation of a new
> feature, use 'git rebase' to apply these patches to the latest revision
> from CVS. Then send them to someone who has write access to the CVS
> repository. (Preferably that person can also do some reviewing.)
>
> There are many alternatives to CVS. Git is the fastest and has good support
> for branches and distributed development. It is designed to manage large
> projects such as the Linux kernel.
>
>
> Links
> -----
>
> Git homepage: http://git-scm.com/
> Git web interface: https://git.wiki.kernel.org/index.php/Gitweb
> Bare-bones GUI: http://www.kernel.org/pub/software/scm/git/docs/gitk.html
> Free public git hosting: http://github.com/ http://gitorious.org/
>
>
>
> 2. Details
> ==========
>
> I'll discuss the cvs-to-git conversion at the end. For now, we'll just start
> from my public cp2k mirror on github.com. I assume you know how to install
> git and gitk on your OS, and I also assume that OS is a Unix.
>
>
> Cloning a repository
> --------------------
>
> Download the entire history of the CP2K source code:
>
> git clone git://github.com/tovrstra/cp2k.git
>
> Some notes:
> - A directory cp2k is created with the latest master branch.
> - You also get the history with all patches. This is stored in cp2k/.git/.
>
>
> Fire up gitk
> ------------
>
> cd cp2k
> gitk &
>
> Some notes:
> - The graphical interface is very light and suitable for remote X.
> - With the '--all' option one sees all branches.
> - There are some menu items in gitk to prepare commits etc, but I recommend
> using the conventional command-line interface of git instead.
>
>
> Switch branches
> ---------------
>
> My cp2k repository also has a branch where some CVS-specific parts in the
> Makefile are replaced by their git counterparts. One switches the working
> directory to this branch as follows:
>
> git checkout cp2k-git
>
> Compilation is done as usual. One gets an overview of all branches as
> follows:
>
> git branch
>
>
> Include branch name in the shell prompt
> ---------------------------------------
>
> Playing with different branches quickly becomes confusing. One may use the
> following PS1 variable to include the branch name in the shell prompt.
>
> GITPS1='$(__git_ps1 ":%s")'
> export PS1="\u@\h \w${GITPS1}> "
>
> or with fancy colors (designed for a dark background)
>
> GITPS1='$(__git_ps1 ":%s")'
> GREEN="\[\033[1;32m\]"
> BLUE="\[\033[1;34m\]"
> YELLOW="\[\033[1;33m\]"
> RS="\[\033[00m\]"
> export PS1="${GREEN}\u@\h${RS} ${BLUE}\w${RS}${YELLOW}${GITPS1}${BLUE}>${RS} "
>
> The prompt will look like this:
>
> toon@molmod49 ~/cp2k:cp2k-git>
>
> This is not mandatory, but it makes working with branches a lot easier.
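
The PS1 settings above rely on the `__git_ps1` helper, which comes from the shell integration scripts shipped with git and is not loaded in every shell. As a minimal sketch, assuming the helper may be missing, a simplified stand-in can be defined that only prints the branch name. The fallback body and its default format are assumptions made for illustration; the real helper does considerably more.

```shell
# Simplified stand-in for __git_ps1, defined only if the shell does not
# already provide the real helper. It prints the current branch name using
# a printf-style format such as ":%s", and prints nothing outside a
# repository or on a detached HEAD.
if ! type __git_ps1 >/dev/null 2>&1; then
    __git_ps1() {
        local fmt="${1:- (%s)}" branch
        # symbolic-ref fails outside a repository; stay silent in that case.
        branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 0
        printf -- "$fmt" "$branch"
    }
fi
```

Placed in ~/.bashrc before the PS1 lines, this keeps the prompt working on machines where the git shell helpers are not installed.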
>
>
> Configure git
> -------------
>
> Add these sections to ~/.gitconfig (with your own personal info).
>
> [user]
>    name = Toon Verstraelen
>    email = Toon.Ver... at UGent.be
>
> [color]
>    diff = always
>    status = always
>    interactive = always
>    branch = always
>
> This is not mandatory, but again very convenient.
>
>
> Create a new branch to store your patches
> -----------------------------------------
>
> Instead of adding patches to the cp2k-git branch, it is safer to add them to
> a new private branch. All commits remain local anyway until you run 'git
> push some-repo'. The new branch initially only exists in your local copy of
> the cp2k repository.
>
> toon@molmod49 ~/cp2k:cp2k-git> git branch myhack
> toon@molmod49 ~/cp2k:cp2k-git> git checkout myhack
> toon@molmod49 ~/cp2k:myhack>
>
> This can also be done in one step.
>
> toon@molmod49 ~/cp2k:cp2k-git> git checkout -b myhack
> toon@molmod49 ~/cp2k:myhack>
>
>
> Add a patch
> -----------
>
> The example here is just a fix for a trivial typo in the input
> documentation. On line 232 of the file input_cp2k_mm.F a space is missing at
> the end of the string. It is also convenient to wrap lines at 80 characters
> because this is the default width of a text terminal. After making the
> changes, they can be reviewed:
>
> toon@molmod49 ~/cp2k:myhack> git diff
> diff --git a/src/input_cp2k_mm.F b/src/input_cp2k_mm.F
> index 26b37f5..200e000 100644
> --- a/src/input_cp2k_mm.F
> +++ b/src/input_cp2k_mm.F
> @@ -320,8 +320,8 @@ CONTAINS
>        CALL keyword_release(keyword,error=error)
>        !Universal scattering potential at very short distances
>        CALL keyword_create(keyword, name="ZBL_SCATTERING",&
> -            description="A short range repulsive potential is added, to simulate"//&
> -            "collisions and scattering.",&
> +            description="A short range repulsive potential is added, to "//&
> +            "simulate collisions and scattering.",&
>             usage="ZBL_SCATTERING T",default_l_val=.FALSE.,lone_keyword_l_val=.TRUE.,&
>             error=error)
>        CALL section_add_keyword(section,keyword,error=error)
>
> The minus-lines are colored in red and the plus-lines in green. After
> testing the patch -- a simple compilation is sufficient for this -- one can
> commit the changes to the repository. This is typically done in two stages.
> One first adds the changes to an intermediate stage, called the index. Once
> the index is OK, it is actually committed. This two-step approach is
> convenient when working with more complex patches.
>
> Add the file to the index:
>
> toon@molmod49 ~/cp2k:myhack> git add src/input_cp2k_mm.F
>
> Commit it:
>
> toon@molmod49 ~/cp2k:myhack> git commit
>
> An editor will open in which one writes a few notes. The first line is a short
> summary, optionally followed by an empty line and a longer discussion.
>
> The two steps can be done in one command if there are only modifications to
> existing files:
>
> toon@molmod49 ~/cp2k:myhack> git commit src/input_cp2k_mm.F
>
> or
>
> toon@molmod49 ~/cp2k:myhack> git commit -a
>
> If there is only one line in the commit message, it can be given on the
> command line:
>
> toon@molmod49 ~/cp2k:myhack> git commit -a -m 'Fixed typo'
>
> Keep repeating this with all the things you want to change. Keep commits as
> small as possible and test them. One can look back at the commit history
> with gitk or 'git log'.
>
>
> Rebase patches to the latest master
> -----------------------------------
>
> In practice it takes some time before a set of patches is finished and often
> the CVS master branch evolves in the meantime. I occasionally synchronize
> my git repo and apply the cp2kgit patch on top. It is recommended to apply
> your patches to the latest version too. This is typically a painful job, but
> with 'git rebase' it becomes trivial.
>
> First update your local mirror of the repository:
>
> toon@molmod49 ~/cp2k:myhack> git checkout cp2k-git
> toon@molmod49 ~/cp2k:cp2k-git> git pull origin cp2k-git:cp2k-git
> (some progress output)
>
> Some notes:
> - origin refers to the GitHub repository. It is the default shorthand for
> the repository that was used with 'git clone'.
> - cp2k-git:cp2k-git is optional. It means that the remote cp2k-git branch is
> used to update the local cp2k-git branch.
>
> Then rebase your patches:
>
> toon@molmod49 ~/cp2k:cp2k-git> git checkout myhack
> toon@molmod49 ~/cp2k:myhack> git rebase cp2k-git
>
> In this case the patch is so small that you will probably not have to
> intervene manually, unless somebody changed exactly the same two lines or
> the six surrounding lines. In more complex cases 'git rebase' will stop
> when it encounters a doubtful situation. Some instructions are given such
> that you can easily modify the problematic patch and continue the rebase
> process.
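
The update and rebase steps above can be wrapped in one small shell function. This is only a sketch: the function name `sync_and_rebase` is made up for illustration, the branch names default to the ones used in this mail, and `--ff-only` is added on the assumption that the local mirror branch carries no commits of its own.

```shell
# Hypothetical helper: update the local mirror branch from origin, then
# replay the topic branch on top of it.
# Usage: sync_and_rebase [topic-branch] [mirror-branch]
sync_and_rebase() {
    local topic="${1:-myhack}" base="${2:-cp2k-git}"
    git checkout "$base" &&
        git pull --ff-only origin "$base" &&
        git checkout "$topic" &&
        git rebase "$base"
}

# If the rebase stops on a conflict, edit the files it lists, then run:
#   git add <resolved-file>
#   git rebase --continue
# To give up and return to the state before the rebase:
#   git rebase --abort
```

The function stops at the first failing step, so a failed update never leads to a rebase onto a stale branch.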
>
>
> Sending patches by email
> ------------------------
>
> Once a set of patches is ready, they can be prepared for email as follows:
>
> toon@molmod49 ~/cp2k:myhack> git format-patch -1
> 0001-Fixed-typo.patch
>
> The -1 option limits the output to the most recent commit, so one patch file
> is created; use -N for the last N commits. Put these files in a compressed
> archive and send the archive to somebody with CVS write access. They'll know
> what to do with it.
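
For a branch with several commits it can be easier to export everything that is not yet on the mirror branch instead of counting commits. A minimal sketch, assuming the branch names used in this mail; the function name `export_patches` and the default directory name `patches` are made up for illustration.

```shell
# Hypothetical helper: write one patch file per commit that is on the
# current branch but not on the base branch, then pack them into a single
# compressed archive next to the output directory.
# Usage: export_patches [base-branch] [output-dir]
export_patches() {
    local base="${1:-cp2k-git}" outdir="${2:-patches}"
    git format-patch -o "$outdir" "$base" || return 1
    tar -czf "$outdir.tar.gz" "$outdir"
}
```

Run from the myhack branch, `export_patches` with no arguments would leave the numbered patch files in patches/ and the archive in patches.tar.gz.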
>
>
> A few benchmarks
> ----------------
>
> diffs
> ^^^^^
>
> The diff is executed after making the changes in the above example.
>
> time git diff &> /dev/null
>
> real    0m0.017s
> user    0m0.000s
> sys     0m0.010s
>
> time cvs diff &> /dev/null
>
> real    0m1.484s
> user    0m0.040s
> sys     0m0.040s
>
> This is just a small patch, but with large patches the benchmarks become
> more dramatic. Because most CP2K developers care about speed, I guess a
> speedup of two orders of magnitude will be appreciated. Similar speedups can
> be found with other commands that git and CVS have in common.
>
> Cloning
> ^^^^^^^
>
> This is a special benchmark. A complete repository clone is not really
> supported in CVS. Therefore I compare a 'git clone' with a CVS checkout
> instead. The latter is a much lighter operation.
>
> time cvs -z3 -d:pserver:anonymous@cvs.cp2k.berlios.de:/cvsroot/cp2k co cp2k
> (lots of output)
> real    0m13.987s
> user    0m2.040s
> sys     0m1.480s
>
> time git clone git://github.com/tovrstra/cp2k.git
> (some output)
> real    0m20.315s
> user    0m8.910s
> sys     0m1.240s
>
> The comparison is a bit difficult as it is mainly determined by the hosting
> server. Note that git downloads all revisions, while CVS only gives you the
> latest version. This is just to show that there is no practical problem with
> cloning entire repositories in git. The storage is also remarkably compact:
>
> du -sh cp2k/.git/
> 59M
>
> This is less than the tar file with the CVSROOT.
>
>
> CVS to git migration
> --------------------
>
> This does not always go smoothly. It turns out that CVS does not keep an
> accurate history of all patches and metadata, and that it may be difficult
> to convert all this information to a revision system with a proper storage
> backend. The tigris community has developed a complex batch script that
> tries to make the best out of it. More info can be found here:
>
> http://cvs2svn.tigris.org/
>
> They also have a cvs2git script. In case of CP2K it is used as follows:
>
> wget http://download.berlios.de/cvstarballs/cp2k-cvsroot.tar.gz
> tar -xvzf cp2k-cvsroot.tar.gz
> mkdir cvs
> mkdir git2
> mv cp2k cvs
> cd git2
> cvs2git ../cvs/cp2k --blobfile=cp2kblob --dumpfile=cp2kdump --username=fubar
> mkdir cp2k
> cd cp2k
> git init
> cat ../cp2kblob ../cp2kdump | git fast-import
>
>
> CVS to git migration bis
> ------------------------
>
> One can also use the 'git cvsimport' script. It is somewhat simpler and can
> also update a git mirror with the latest changes in a CVS repository. The
> first time one has to do a full conversion:
>
> wget http://download.berlios.de/cvstarballs/cp2k-cvsroot.tar.gz
> tar -xvzf cp2k-cvsroot.tar.gz
> mkdir cvs
> mkdir git
> mv cp2k cvs
> cd git
> git cvsimport -d ${PWD}/../cvs/cp2k cp2k
>
> Later on, one can do updates:
>
> git cvsimport -d:pserver:anonymous@cvs.cp2k.berlios.de:/cvsroot/cp2k cp2k
>
> The update step may, on rare occasions, fail. It is recommended to update in
> a dedicated working directory and to use 'git push' to send the new patches
> to a separate mirror. There seems to be little general interest in this
> update feature. Most people convert just once and never look back.
>
>
> A downside of git
> -----------------
>
> Git uses SHA-1 hashes to label commits (and other things). This has many
> technical advantages, but it is not as intuitive as the simple version
> numbers that CVS or SVN use. Good one-line summaries and proper use of 'git
> tag' mostly solve this issue.
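
Once a few tags exist, 'git describe' can turn a hash into something closer to a version number. A minimal sketch; the function name `readable_version` is made up for illustration, and any tag names are whatever you choose to create with 'git tag'.

```shell
# Hypothetical helper: print a human-readable label for a commit
# (default: HEAD). On a tagged commit the output is the tag name itself;
# otherwise it looks like <tag>-<n>-g<short-hash>, where n is the number
# of commits since that tag. --always falls back to the abbreviated hash
# when no tag is reachable.
readable_version() {
    git describe --tags --always "${1:-HEAD}"
}
```

Tagging the commit that corresponds to a given CVS snapshot, and then referring to later commits by the output of this function, gives labels that are easier to quote in a mail than a bare hash.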
>
>
>
>


