Re: "Producting Open Source Software" book and distributed SCMs

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: "Producting Open Source Software" book and distributed SCMs
Date: Tue, 1 May 2007 11:35:54 +0200 (CEST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0705011057130.29859@racer.site> (raw)
In-Reply-To: <200704300120.42576.jnareb@gmail.com>

Hi,

On Mon, 30 Apr 2007, Jakub Narebski wrote:

> I have read lately classic book "Producing Open Source Software. How to 
> Run a Successful Free Software Project" by Karl Fogel (2005).
> 
> Among others, author advocates using version control system as a basis 
> for running a project. In "Choosing a Version Contol System" he writes:
> 
>   As of this writing, the version control system of choice in the free
>   software world is the Concurrent Versions System or CVS.

Back then, it was. I ran all my projects on CVS. Then came along Git. I 
tried to keep up with it, but had to quit for day-job reasons. When I came 
back, Git was already so good that I switched almost everything over.

> The distributed SCM is mentioned in footnote in section "Comitters" in 
> Chapter 8, Managing Volunteers:
> 
>  http://producingoss.com/producingoss.html#ftn.id284130
> 
>   [22] Note that the commit access means something a bit different in
>   decentralized version control systems, where anyone can set up a
>   repository that is linked into the project, and give themselves commit
>   access to that repository. Nevertheless, the concept of commit access
>   still applies: "commit access" is shorthand for "the right to make
>   changes to the code that will ship in the group's next release of the
>   software." In centralized version control systems, this means having
>   direct commit access; in decentralized ones, it means having one's
>   changes pulled into the main distribution by default. It is the same
>   idea either way; the mechanics by which it is realized are not
>   terribly important.
> 
> 
> I'm interested in your experience with managing projects using 
> distributed SCM, or even better first centralized then distributed SCM: 
> is the above difference the only one?

In my experience, the offline mode has been a huge advantage. For example, 
in one project I work together with people from three different countries, 
some of them traveling quite a bit. I sold Git solely on the 
transportability. One of them was so happy that he switched over most of 
his projects, too.

BTW that is the common way I see: once people get hooked, they not only 
convert their existing projects to Git, but they use cvsimport a lot more, 
and they start to manage configuration settings, documents, pictures, etc. 
with Git, because it gives rise an easy backup mechanism.

Another difference between central and distributed operation I see is the 
workflow. With Git, you can commit much more often. For example, when 
working with Sourceforge's CVS (which _was_ comparable with the speed of 
corporate SourceSafe repos), I would always think about committing (and 
having a coffee), or rather combine these changes with the next ones.

Obviously, committing more often leads to a much nicer repository 
structure, making it much easier to get into the code for new developers. 
It also makes it easier to get at bugs. And because it is so much faster, 
you can actually do a "git diff" before committing, to make sure that you 
did not leave in that stupid debug statement.

> Linus has said that fully distributed SCM improves forkability:
>
> [...] 
> 
>   I think that "forking" is what keeps people honest. The _biggest_
>   downside with CVS is actually that a central repository gets so much
>   _political_ clout, that it's effectively impossible to fork the
>   project: [...]
> 
> According to "Producting Open Source Software" it is very important 
> feature for an OSS project.
>
> [...]
> 
>   Because a fork is bad for everyone (for reasons examined in detail in 
>   the section called "Forks" in Chapter 8, Managing Volunteers, 
>   http://producingoss.com/producingoss.html#forks), the more serious the 
>   threat of a fork becomes, the more willing people are to compromise to 
>   avoid it.

This is a lousy argument, IMHO.

Why are forks bad? They are not. But if you "learnt" that merges are hard, 
they are.

It is a pity that so many people were trained in CVS, and keep thinking 
some of the lectures were true, when they are no longer.

Forks are good. In fact, we all "forked" with CVS as soon as we began 
hacking. Everybody who claims to never have started over from a fresh 
checkout, or from an "update -C"ed state, is probably lying, or a bad 
developer. Thinking about it, I believe that the difference between 
forking and branching is philosophical, not technical. You can always 
merge a fork.

And the thing is, you would not start hacking on some obscure feature, if 
that happened completely in the open, for fear of being accused a complete 
moron.

With CVS, that meant that you tried to get at a stage where others could 
see that it was worth doing, before committing. Which makes for monster 
commits. ("The number of bugs is the _square_ of the number of changed 
lines.") With Git, that problem is virtually not there.

> Besides that, what are the differences between managing project using 
> centralized SCM and one using distributed SCM? What is equivalent of 
> committers, giving full and partial commit access, revoking commit 
> access?

I have to admit that I drive one of my projects "CVS" style, with SSH 
accounts for all developers, who push into the same repo.

But that worked quite well up to now.

If I _had_ to restrict them, I'd probably do that by (temporarily) 
assigning a release engineer, and setting up some hook scripts in all 
repos. But I don't believe in restriction when it comes to creativity.

> How good support for tagging and branching influences creating 
> code and build procedure?

> Is distributed SCM better geared towards "benovolent dictator" model 
> than "consensus-based democracy" model, as described in OSSbook?

Not at all. I think the best example is kernel.org, where you find tons of 
forks. IMHO it is really helping the benevolent dictator cave into the 
consensus-based model, since forks can be preferred at any time. Hey, even 
switching from one to another upstream is just a git-pull away!

Ciao,
Dscho

next prev parent reply	other threads:[~2007-05-01  9:36 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-04-29 23:20 "Producting Open Source Software" book and distributed SCMs Jakub Narebski
2007-05-01  9:35 ` Johannes Schindelin [this message]
2007-05-01 15:23   ` Theodore Tso
2007-05-01 15:45     ` Johannes Schindelin
2007-05-01 18:30   ` Jakub Narebski
2007-05-01 23:13     ` Linus Torvalds
2007-05-01 16:15 ` Linus Torvalds
2007-05-01 22:27   ` Jakub Narebski
2007-05-01 22:45     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0705011057130.29859@racer.site \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).