Ian Bicking: a blog :: Git-as-sync, not source-control-as-deployment

{ 2012 02 14 }

Git-as-sync, not source-control-as-deployment

I don’t like systems that use git push for deployment (Heroku et al). Why? I do a lot of this:

$ git push deploy
... realize I forgot a domain name ...
$ git commit -m "fix domain name" -a ; git push deploy
... realize I didn't do something right with the database setup ...
$ git commit -m "configure database right" -a ; git push deploy
... dammit, I didn't fix it quite right ...
$ git commit -m "typo" -a ; git push deploy

And then maybe I’d actually like to keep my config out of my source control, or have a build process that I run locally, or any number of things. I’d like to be able to test deployment, but every deployment is a commit, and I like to commit tested work. I think I could use git rebase but I lack the discipline to undo my work so I can do it correctly. This is why I don’t do continuous commits.

There’s a whole different level of weirdness when you use GitHub Pages: you aren’t pushing to a deployment-specific remote, you are pushing to a deployment-specific branch.

So I’ve generally thought: git deployment is wrong.

Then I was talking to some other people at Mozilla and they mentioned that ops was using git simply for moving files around, even though the stuff they were deploying was itself in Mercurial. They had a particular site with a very large number of files, and it was faster to use git than rsync (git has more metadata than rsync; rsync has to look at everything every time you sync). And that all seemed very reasonable; git is a fine way to sync things.

But I kind of forgot about it all, and just swore to myself as I did too many trivial commits and wrote too many meaningless commit messages.

Still… it isn’t so hard to separate these concerns, is it? So I wrote a small command called git-sync. The basic idea: copy the working directory to a new location (minus .git/), commit that, and push the result to your deployment remote. You can send modified and untracked files, and you can run a build script before committing and push the result of the build script, all without sullying your "real" source control. And you happen to get a history of deployments, which is also nice.
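
To make the idea concrete, here is a rough sketch of those steps as a shell function. This is not the actual git-sync script: the function name sync_deploy, its argument order, the use of find and tar for the copy, and the "master" deploy branch are all assumptions made for illustration. It labels each deployment commit with the output of git describe.

```shell
#!/bin/sh
# Sketch of the git-sync idea. NOT the real git-sync script: the function
# name, argument order, and the "master" deploy branch are assumptions.
# usage: sync_deploy SRC_DIR STAGING_DIR DEPLOY_REMOTE [BUILD_CMD]
sync_deploy() {
    src=$1; staging=$2; remote=$3; build=${4:-}

    # Describe the source checkout (tag name if there is one, otherwise an
    # abbreviated commit hash) so the deployment records where it came from.
    label=$(git -C "$src" describe --always --tags) || return 1

    # The staging directory is a separate repository; re-running init on
    # an existing repository is harmless.
    git init -q "$staging" || return 1

    # Empty the staging tree, keeping its own .git/, so that files deleted
    # from the source also disappear from the deployment.
    find "$staging" -mindepth 1 -maxdepth 1 ! -name .git \
        -exec rm -rf {} + || return 1

    # Copy the working directory as it is right now, modified and
    # untracked files included, but without the source .git/.
    (cd "$src" && tar -cf - --exclude=./.git .) |
        (cd "$staging" && tar -xf -) || return 1

    # Optionally run a build step inside the copy, not in the source tree.
    if [ -n "$build" ]; then
        (cd "$staging" && sh -c "$build") || return 1
    fi

    # Record the deployment in the staging repository and push it.
    git -C "$staging" add -A || return 1
    git -C "$staging" commit -q --allow-empty \
        -m "Deploy from $label" || return 1
    git -C "$staging" push -q "$remote" HEAD:refs/heads/master
}
```

Something like sync_deploy . ../myapp-deploy deploy-host:myapp.git 'make build' (all three values hypothetical) would then push the built copy while the real repository's history stays untouched. Keeping the staging repository around between runs is what gives you the deployment history.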

I’ve only used this a little bit, but I’ve enjoyed when I have used it, and it makes me feel much better/clearer about my actual commits. It’s really short right now, and probably gets some things entirely wrong (e.g., moving over untracked files). But it works well enough to be improved (winkwinknudgenudge).

So check it out: https://github.com/ianb/git-sync

11 Comments

Josh Gachnang says:

February 14, 2012 at 11:13 am

This script looks great. I’ll test it out later today.

On the deployment part, what about having multiple branches doing deployments? For example, do development in a dev branch, which deploys to a smaller testing server, then merge the testing into the master branch, which will deploy to the real server(s).
- Ian says:
  
  February 14, 2012 at 12:33 pm
  
  I’m not a fan of that either – what I keep in source control isn’t what I want to push to a server, and I don’t like using a highly async process (git and hooks etc) to transform what I keep in source control into what I can run. Plus you still have to rebase if you want a clean history.
  - Anonymous says:
    
    February 18, 2012 at 5:53 pm
    
    One way to see benefits of the deployment process is to reverse this statement:
    
    > what I keep in source control isn’t what I want to push to a server,
    
    and turn it into:
    
    > What is pushed to a server should be tracked [by source control]
    
    What you DO with that tracking can be separate from that concept.
Mark Nottingham says:

February 14, 2012 at 6:39 pm

Did you look at Unison?

http://www.cis.upenn.edu/~bcpierce/unison/
Simon Sapin says:

February 15, 2012 at 2:40 am

Hi,

I’m sure that git can be faster than rsync, but is this particular script really faster? It still uses rsync to do a local copy before using git.
- Jérôme Petazzoni says:
  
  February 29, 2012 at 11:19 am
  
  This particular script seems to be using rsync locally. In that case, it’s very fast (faster than git, cp, etc.; or rather, as fast as cp, except that IIRC it will skip copying files that already exist with the same size and timestamp).
  
  Git is faster than rsync only when working over the network, because git does not need to involve the remote side as much as rsync does. Rsync needs to enumerate local and remote files, to send only the difference. Git knows what’s on the remote side, and can compute the difference locally.
  
  However, I wonder if git still ends up being faster when you factor in the checkout/update on the receiving side. While I’ve been pleasantly surprised with the speed of git, I would like to see what happens if someone tries this thing with a multi-GB filesystem :-)
Daniel says:

February 21, 2012 at 10:43 am

This might be useful for the problem you gave above, when you’ve got to use something like Heroku and you have minor changes that you don’t want to make a separate commit out of.

git commit -a --amend --reuse-message=HEAD; git push deploy -f
Ole Laursen says:

March 9, 2012 at 9:57 am

What about “git pull”? I realize this is probably not going to be that simple when you have multiple hosts, but for a single host I always ssh to the machine I’m deploying to and pull the updates. That way, you’re also sitting on the machine in case you need to do a disaster recovery.
Joe Jasinski says:

May 18, 2012 at 4:27 pm

Hi, I was wondering what you recommend as a good way to track which commit and branch a release is made from. It might be useful to know which commit the release was built from, so it’s clear which features are included. Is that what the build step is for (it could be used to write out the current branch and commit)? Joe
- Ian Bicking says:
  
  May 30, 2012 at 10:56 am
  
  The script attempts to keep track of this by putting the upstream information in the commit message – it’ll put in a tag name if available, and a commit hash if not. I don’t think it has any branch information, but that could be added. It just uses git describe to get this information.
GitStack's team says:

May 28, 2012 at 12:41 pm

Hi,

I see you write about git, but do you know GitStack? It is a great tool for managing a git server on Windows and other platforms. I can tell you more about it if you have a little time.

Otherwise, thanks for the script; I’ll try it later.

Have a good day,

– GitStack’s Team
